Adaptive treatment allocation schemes based on interim responses
have generated a great deal of recent interest in clinical trials and 
other follow-up studies. An important application of such schemes is 
in survival studies, where the response variable of interest is time to 
the occurrence of a certain event. Due to possible dependency struc- 
tures inherited from the enrollment and allocation schemes, existing 
approaches to survival models, including those that handle staggered 
entry, cannot be applied directly. This paper develops a new general 
framework with its theoretical foundation for handling such adap- 
tive designs. The new approach is based on marked point processes 
and differs from existing approaches in that it considers entry and 
calendar times rather than survival and calendar times. Large sample
properties, which are essential for statistical inference, are established.
Special attention is given to the Cox model and related
score processes. Applications to adaptive and sequential designs are 
discussed. 



1. Introduction. Sequential and adaptive methods play important roles 
in the design and analysis of long-term clinical studies. Pocock (1977), 
O'Brien and Fleming (1979), and Lan and DeMets (1983) proposed various 
boundaries that adjust for multiple testing and are motivated by applica- 
tions to clinical trials; see also Jennison and Turnbull (2000). Zelen (1969), 
Wei and Durham (1978), and Wei (1978) proposed and studied outcome de- 
pendent treatment allocation schemes; see also Hu and Rosenberger (2006), 
Rosenberger and Sverdlov (2008), and Hu, Zhang and He (2009). Fisher 
(1998), Cui, Hung and Wang (1999), and Shen and Fisher (1999), on the 
other hand, proposed adaptive schemes under which sample sizes are re- 
estimated and adapted to interim analysis. Robins (1986) and Murphy and 
Bingham (2008) developed dynamic treatment regimes, in which treatment 
allocations are dynamically adapted to interim outcomes. 

Many long-term clinical trials and epidemiological cohort studies involve 
survival endpoints; see Kalbfleisch and Prentice (2002) and Fleming and 
Harrington (1991) for examples of this kind and for standard statistical
methods. For survival data, the log-rank test (Mantel and Haenszel, 1959) and
proportional hazards regression (Cox, 1972) are the methods of choice. 
Counting processes and associated martingales may be used to derive desired 
theoretical properties; see Andersen, Borgan, Gill and Keiding (1993). 

Sequential and adaptive methods for survival data require simultaneous 
consideration of both calendar and survival times. Sellke and Siegmund 
(1983) and Slud (1984) established a Brownian approximation to the score
process calculated over the diagonal line, i.e., when calendar time meets
survival time. A Gaussian random field approximation to the two-dimensional
score process in the case of two-sample comparison was established by Gu 
and Lai (1991). More general results about Gaussian random field approx- 
imation to the two-dimensional score process under the Cox proportional 
hazards regression can be found in Bilias, Gu and Ying (1997), who made 
use of modern empirical process theory to derive certain key results, bypass- 
ing martingale formulation to handle asymptotic analysis. 

To incorporate both group sequential analysis and adaptive outcome- 
dependent treatment allocation, one needs to consider a score process, to 
which the existing martingale approach or the empirical process theory is 
not applicable. That empirical process theory may not be applied is largely 
due to the outcome-dependent enrollment allocation, which results in study 
units that are not mutually independent. 

This paper develops a new theoretical framework and techniques for the
partial likelihood score process with simultaneous consideration of calendar 
and survival times and with entry times and treatment allocation possibly 
depending on preceding outcomes. The approach is based upon expressing 
the score process as a stochastic integral of a suitably defined marked point 
process. The use of marked point processes for survival data was introduced 
by Arjas and Haara (1984) and Arjas (1989). Related subsequent develop- 
ments can be found in Feng (1999) and Martinussen and Scheike (2006). 

The paper is organized as follows. Section 2 provides basic notation and an 
introduction to the marked point process framework under calendar time. It
also gives an illustrative example involving the Cox model. The correspond- 
ing functional central limit theorems in a general setting are presented in 
Section 3. An application to the Cox proportional hazards model with time- 
dependent covariates is given in Section 4, where convergence properties 
for the corresponding maximum partial likelihood estimator are also estab- 
lished. Some discussion and more applications are given in Section 5. Proofs 
for our main results are provided in Section 6. 
2. Notation and marked point process framework for survival
data. 

2.1. Marked point process framework. Consider a follow-up study with
calendar time period $[0, \tau)$, $\tau \le \infty$. Let $U_i$ be the entry time for
individual $i$, $i \ge 1$. For technical convenience, we assume throughout this
paper that the $U_i$ have no ties. Thus, without loss of generality, we assume
$U_1 < U_2 < \cdots < U_i < \cdots$. Define the associated counting process for
entry times

(2.1)    $R_t = \sum_{i} 1(U_i \le t).$

For subject $i$, let $T_i$ denote survival time (since entry) and $C_i$ censoring time.
Let $\tilde T_i = T_i \wedge C_i$ and $\Delta_i = 1(T_i \le C_i)$, indicating failure (1) or
censoring (0). Thus, if $\Delta_i = 1$ $(0)$, then individual $i$ experiences failure
(censoring) at calendar time $U_i + \tilde T_i$. Furthermore, there is a possibly
time-dependent, $d$-dimensional covariate vector $Z_i = Z_i(\cdot)$, which may include
the $i$-th individual's treatment assignment and certain relevant baseline
characteristics. Here, for any $w \ge 0$, $Z_i(w)$ refers to the covariate value at
calendar time $U_i + w$. As usual, we assume that $Z_i$, as a random variable, is
non-informative for $T_i$ and $C_i$ (i.e., external time-dependent covariates as
discussed in Kalbfleisch and Prentice, 2002). Informally speaking, $Z_i$ can
be taken as determined prior to information on $T_i$ and $C_i$, and thus will
be probabilistically independent of future event development and follow-up,
though it may depend on the historical information up to time $U_i$.

With the above notation, the entire underlying collection of random variables
that may be observed with sufficiently long follow-up is
$\{U_i, Z_i, U_i + \tilde T_i, \Delta_i, i \ge 1\}$. The data may be visualized as a
two-dimensional plot of entry time $U_i$ and event time $U_i + \tilde T_i$, with each
point labeled by $(Z_i, \Delta_i)$. Since it is the event time, not the entry time,
that is of interest here, we combine $U_i$ with $(Z_i, \Delta_i)$ to form a point
$(U_i, Z_i, \Delta_i)$ marking the event time $U_i + \tilde T_i$ in the mark space
$\mathcal{X}$. The structure of $\mathcal{X}$ is built as follows. The first component
is a real number in $[0, \tau)$. The third component is either 0 or 1. The second
component $Z_i$ is a function in $\mathcal{D}_z$, the set of all right continuous
functions on $[0, \tau)$ with bounded variation, on which we use the Skorohod
topology to define a measurable space. Thus, the mark space
$\mathcal{X} = [0, \tau) \times \mathcal{D}_z \times \{0, 1\}$ has a naturally induced
product $\sigma$-algebra.

For a given $t \ge 0$, define a random counting measure on $\mathcal{X}$

    $p_t(I \times E) = \sum_{i} 1(U_i + \tilde T_i \le t,\ U_i \in I,\ (Z_i, \Delta_i) \in E),$
where $I \subset [0, \tau)$ and $E \subset \mathcal{D}_z \times \{0, 1\}$ are the corresponding
Borel measurable subsets. Note that $p_t$ is random since it depends on the random
variables $U_i$, $\tilde T_i$, $Z_i$ and $\Delta_i$, $i \ge 1$. From (2.1), we can rewrite

(2.2)    $p_t(I \times E) = \int_I 1(u + \tilde T_u \le t,\ (Z_u, \Delta_u) \in E)\, dR_u,$

where, with a slight abuse of notation, $\tilde T_u$, $Z_u$, and $\Delta_u$ refer to
$\tilde T_i$, $Z_i$, and $\Delta_i$ when $u = U_i$, which is well defined since the $U_i$
are distinct for different $i$.
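The counting process (2.1) and the counting measure (2.2) can be illustrated numerically. The following is a minimal sketch, assuming time-fixed scalar covariates and a small made-up data set; the names `Subject`, `entry_count`, and `p_t` are hypothetical and not part of the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subject:
    entry: float     # U_i, calendar entry time
    obs_time: float  # observed time since entry, min(T_i, C_i)
    delta: int       # Delta_i: 1 = failure, 0 = censoring
    z: float         # covariate (time-fixed here, an illustrative simplification)


def entry_count(subjects: List[Subject], t: float) -> int:
    """R_t: number of subjects whose entry time satisfies U_i <= t."""
    return sum(1 for s in subjects if s.entry <= t)


def p_t(subjects: List[Subject], t: float,
        in_I: Callable[[float], bool],
        in_E: Callable[[float, int], bool]) -> int:
    """p_t(I x E): number of subjects whose event (failure or censoring)
    occurs by calendar time t, with U_i in I and (Z_i, Delta_i) in E."""
    return sum(
        1 for s in subjects
        if s.entry + s.obs_time <= t and in_I(s.entry) and in_E(s.z, s.delta)
    )


# Illustrative data only (not from the paper).
subjects = [
    Subject(entry=0.0, obs_time=2.0, delta=1, z=1.0),
    Subject(entry=0.5, obs_time=1.0, delta=0, z=0.0),
    Subject(entry=1.5, obs_time=3.0, delta=1, z=0.0),
]

# Failures by calendar time 3 among subjects who entered in I = [0, 1].
failures_by_3 = p_t(subjects, 3.0, lambda u: u <= 1.0, lambda z, d: d == 1)
```

For fixed `in_I` and `in_E`, `p_t` is non-decreasing in `t`, which is the property used to build the joint measure on calendar time and mark space.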

The random measure $p_t(I \times E)$ may be viewed as a trivariate function of
$t$, $I$, and $E$. When $I$ and $E$ are fixed, $p_t(I \times E)$ is a non-decreasing
function of $t$, thereby inducing a Lebesgue-Stieltjes measure on $[0, \tau)$. Since
for fixed $t$, $p_t(I \times E)$ is already a measure on $\mathcal{X}$, we can combine
the two measures together to get a joint measure on $[0, \tau) \times \mathcal{X}$:

(2.3)    $p(ds\,du\,dz\,d\delta) = \sum_{i \ge 1} 1(U_i + \tilde T_i \in ds,\ U_i \in du,\ Z_i \in dz,\ \Delta_i = \delta)$
         $\qquad = 1(u + \tilde T_u \in ds,\ Z_u \in dz,\ \Delta_u = \delta)\, dR_u.$

Note that the support of this measure is on $\{(u, s) : 0 \le u \le s < \tau\}$.

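The above random measure also provides a way to define information
accumulation over calendar time $t$. This is done by introducing the following
internal $\sigma$-filtration:

    $\mathcal{F}_t = \sigma\{p(A \times I \times E),\ Z_u,\ R_u :$
    $\qquad \forall u \le t,\ A \subset [0,t],\ I \subset [0,t],\ E \subset \mathcal{D}_z \times \{0,1\}\}.$

A sub-$\sigma$-algebra of $\mathcal{F}_t$ that is of interest is defined by

    $\mathcal{F}_{t,\vartheta} = \sigma\{p(A \times I \times E),\ Z_u,\ R_u :$
    $\qquad \forall u \le \vartheta \wedge t,\ A \times I \subset [0,t] \times [0,\vartheta],\ E \subset \mathcal{D}_z \times \{0,1\}\}.$

Intuitively, $\mathcal{F}_{t,\vartheta}$ represents covariate and event history up to time $t$
for individuals that enrolled before time $\vartheta$, where $0 \le \vartheta \le t$.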

Without loss of generality, we shall assume throughout that $R_t$ and $Z_t$
are predictable with respect to $\{\mathcal{F}_t, t \ge 0\}$, which is standard in survival
analysis. We also need the following condition.

Condition A. An individual's current survival probability does not depend
on the future information of people who enrolled earlier than him/her in the
sense that for any $s$ and $t > s$,

(2.4)    $P(\tilde T_i \in (s, s+ds],\ \Delta_i \mid \mathcal{F}_{U_i+t,\,U_i-}) = P(\tilde T_i \in (s, s+ds],\ \Delta_i \mid \mathcal{F}_{U_i+s,\,U_i-}).$

Moreover, assume

(2.5)    $P(\tilde T_i \in (s, s+ds],\ \Delta_i \mid \mathcal{F}_{U_i+s}) = P(\tilde T_i \in (s, s+ds],\ \Delta_i \mid \mathcal{F}_{U_i+s,\,U_i}).$

Remark 2.1. Here $\mathcal{F}_{U_i+s,\,U_i-}$ represents covariates and event history up
to calendar time $U_i + s$ for subjects $1, \ldots, i-1$; $\mathcal{F}_{U_i+s,\,U_i}$ represents
covariates and event history up to calendar time $U_i + s$ for subjects $1, \ldots, i$.

As we mentioned earlier, $p_t(I \times E)$ is non-decreasing in $t$. Therefore, the
Doob-Meyer decomposition (Jacod and Shiryaev, 2003, page 66) implies the
existence of a compensator. The following lemma provides a general way to
obtain such a compensator and the corresponding martingale properties.

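Lemma 2.2. For any Borel sets $I \subset [0, \tau)$ and $E \subset \mathcal{D}_z \times \{0,1\}$, there
exists a predictable compensator $q(ds\,du\,dz\,d\delta)$ for $p(ds\,du\,dz\,d\delta)$, such that

(2.6)    $M_t(I \times E) = p_t(I \times E) - \int_0^t \int_{I \times E} q(ds\,du\,dz\,d\delta)$

is an $\{\mathcal{F}_t, t \ge 0\}$ martingale, where
$q(dt\,du\,dz\,d\delta) = E(p(dt\,du\,dz\,d\delta) \mid \mathcal{F}_{t-})$.
Moreover, under Condition A, for fixed $t$,

(2.7)    $M_{t,\vartheta}(E) := p_t([0, \vartheta] \times E) - \int_0^t \int_0^{\vartheta \wedge s} \int_E q(ds\,du\,dz\,d\delta),$

as a process in $\vartheta$, is an $\{\mathcal{F}_{t,\vartheta}, 0 \le \vartheta \le t\}$ martingale.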

Remark 2.3. Here we define the basic martingale process in calendar 
time, in contrast to the usual approach of defining the basic martingale pro- 
cess in survival time. The calendar time-based approach is more natural for 
sequential analysis since interim analyses are conducted along calendar time. 
In addition, we use the entry time as the second time dimension. This is also 
natural since entry time indicates sample accumulation. 

From (2.6), $M_t(\cdot)$ defines a combined random measure
$dM_s = p(ds\,du\,dz\,d\delta) - q(ds\,du\,dz\,d\delta)$ on the calendar time and mark space
$[0, \tau) \times \mathcal{X}$. More formally, by a random measure on $[0, \tau) \times \mathcal{X}$,
we mean a kernel mapping from the event space to $[0, \tau) \times \mathcal{X}$ (Last and
Brandt, 1995). At a point $(s, u, z, \delta) \in [0, \tau) \times \mathcal{X}$, let
$f_{s,u,z,\delta}(t)$ be an $\mathcal{F}_t$ measurable random variable indexed by
$(s, u, z, \delta)$; see (2.14) for the Cox model as an example. Its integral with
respect to $dM_s$ can be expressed as

    $\int_0^t \int_{I \times E} f_{s,u,z,\delta}(t)\, dM_s = \sum_{i:\ U_i + \tilde T_i \le t,\ U_i \in I,\ (Z_i, \Delta_i) \in E} f_{U_i + \tilde T_i,\,U_i,\,Z_i,\,\Delta_i}(t)$

(2.8)    $\qquad - \int_0^t \int_{I \times E} f_{s,u,z,\delta}(t)\, q(ds\,du\,dz\,d\delta).$

When $f_{s,u,z,\delta}(t)$ is $\mathcal{F}_s$ predictable, results for martingale integration may
be used; see Kallianpur and Xiong (1995, Chapter 3) and Jacod and Shiryaev
(2003, Chapter II). In particular, for $\mathcal{F}_s$ predictable $f_{s,u,z,\delta}(t)$ with

    $E \int_0^t \int_{\mathcal{X}} f^2_{s,u,z,\delta}(t)\, p(ds\,du\,dz\,d\delta) < \infty,$

the above integral

(2.9)    $M^f_t(I \times E) := \int_0^t \int_{I \times E} f_{s,u,z,\delta}(t)\, dM_s$

is a square integrable $\{\mathcal{F}_t, t \ge 0\}$ martingale with predictable variation
process

(2.10)    $\langle M^f(I \times E) \rangle(t) = \int_0^t \int_{I \times E} f^2_{s,u,z,\delta}(t)\, q(ds\,du\,dz\,d\delta),$

which is useful for variance estimation.

In general, the predictability assumption may not always be satisfied. In
those cases, we will use $dM_s$ as a measure for sample path-wise integration,
which is well defined in (2.8).

2.2. Cox proportional hazards regression model. We illustrate the above
construction through the Cox (1972) proportional hazards regression model
with a dependent/independent enrollment process. For simplicity, we take
$Z_i$ to be one-dimensional. For survival time $T_i$, the Cox model specifies

(2.11)    $P(T_i > w \mid \mathcal{F}_{U_i + w}) = \exp\left\{ -\int_0^w \exp\{\beta Z_i(s)\}\, \lambda_0(s)\, ds \right\}, \quad w \ge 0,$

where $\lambda_0(\cdot)$ is the baseline hazard function and $\beta$ is the regression parameter.
In addition, we use $\lambda_{i,C}(\cdot)$ to denote the hazard function for the censoring
time $C_i$.
By Lemma 2.2, we can write the compensator for $p(ds\,du\,dz\,d\delta)$ as

    $q(ds\,du\,dz\,d\delta) =$
    $\quad 1(\tilde T_u \ge s-u,\ Z_u \in dz)\, \exp\{\beta Z_u(s-u)\}\, \lambda_0(s-u)\, dR_u\, ds$,  if $s \ge u$, $\delta = 1$;
    $\quad 1(\tilde T_u \ge s-u,\ Z_u \in dz)\, \lambda_{u,C}(s-u)\, dR_u\, ds$,  if $s \ge u$, $\delta = 0$;
    $\quad 0$,  otherwise.

For each $k = 0, 1, 2$ and any $\vartheta \ge 0$, $w \ge 0$, $\vartheta + w \le \tau$, let

(2.12)    $\Gamma_k(\beta; \vartheta, w) = \sum_{U_i \le \vartheta} Z_i^k(w)\, \exp(\beta Z_i(w))\, 1(\tilde T_i \ge w).$

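We can express the log partial likelihood $l(\beta; t)$ for $\beta$ as

(2.13)    $l(\beta; t) = \int_0^t \int_0^s \int_{\mathcal{D}_z \times \{0,1\}} \big( \beta z(s-u) - \log[\Gamma_0(\beta; t-(s-u), s-u)] \big)\, 1(\delta = 1)\, p(ds\,du\,dz\,d\delta);$

see equation (1) in Sellke and Siegmund (1983). The score process can then
be written as

    $U(\beta; t) = \int_0^t \int_0^s \int_{\mathcal{D}_z \times \{0,1\}} [Z_u(s-u) - \bar Z(\beta; t, s-u)]\, 1(\delta = 1)\, p(ds\,du\,dz\,d\delta)$
    $\qquad = \int_0^t \int_0^s \int_{\mathcal{D}_z \times \{1\}} f_{s,u,z,\delta}(t)\, dM_s,$

where $dM_s = p(ds\,du\,dz\,d\delta) - q(ds\,du\,dz\,d\delta)$ and
$f_{s,u,z,\delta}(t) = z(s-u) - \bar Z(\beta; t, s-u)$, in which $z(\cdot)$ is the index
function $z$ in $f_{s,u,z,\delta}(t)$ and

    $\bar Z(\beta; t, w) = \frac{\Gamma_1(\beta; t-w, w)}{\Gamma_0(\beta; t-w, w)}.$

More generally, we can define a two-parameter score process with respect to
calendar time $t$ and entry time $\vartheta$ as

(2.14)    $U(\beta; t, \vartheta) = \int_0^t \int_0^{s \wedge \vartheta} \int_{\mathcal{D}_z \times \{1\}} f_{s,u,z,\delta}(t)\, dM_s.$

Note that $U(\beta; t, t) = U(\beta; t)$.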
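The quantities in (2.12) and the first (counting-measure) form of the score process can be computed directly from data observed by calendar time $t$. The sketch below assumes time-fixed scalar covariates and uses a small made-up data set; the function names `gamma_k`, `z_bar`, and `score` are illustrative and not taken from the paper.

```python
import math
from typing import List, Optional, Tuple

# Each record is (U_i, observed time since entry, Delta_i, Z_i).
Record = Tuple[float, float, int, float]


def gamma_k(data: List[Record], beta: float, k: int,
            theta: float, w: float) -> float:
    """Gamma_k(beta; theta, w): sum over subjects with U_i <= theta who are
    still under observation at survival time w of Z_i^k * exp(beta * Z_i)."""
    return sum((z ** k) * math.exp(beta * z)
               for (u, x, d, z) in data if u <= theta and x >= w)


def z_bar(data: List[Record], beta: float, t: float, w: float) -> float:
    """Weighted covariate average over the risk set at calendar time t
    and survival time w, i.e. Gamma_1 / Gamma_0 evaluated at (t - w, w)."""
    return gamma_k(data, beta, 1, t - w, w) / gamma_k(data, beta, 0, t - w, w)


def score(data: List[Record], beta: float, t: float,
          theta: Optional[float] = None) -> float:
    """Score at calendar time t, restricted to subjects with U_i <= theta
    (theta defaults to t, which gives the one-parameter score)."""
    if theta is None:
        theta = t
    total = 0.0
    for (u, x, d, z) in data:
        # Only failures that have occurred by calendar time t contribute.
        if d == 1 and u + x <= t and u <= theta:
            total += z - z_bar(data, beta, t, x)
    return total


# Illustrative data only (not from the paper).
data = [(0.0, 2.0, 1, 1.0), (0.5, 1.0, 0, 0.0),
        (1.5, 3.0, 1, 0.0), (1.0, 1.5, 1, 0.0)]
u_at_3 = score(data, beta=0.0, t=3.0)
```

Note that the risk set behind `z_bar` always uses the calendar time `t`, while `theta` only restricts which subjects' events are summed, mirroring the integration region $u \le s \wedge \vartheta$ in (2.14).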

As we mentioned in Remark 2.3, $U(\beta; t)$ is an integral along calendar time
instead of survival time. By Lemma 2.2, we can use the martingale central
limit theorem to obtain convergence properties for $U(\beta; t)$ in $t$; see Sections
3 and 4 for details. Through this framework, the enrollment and covariate
history is expressed by the filtration $\mathcal{F}_t$. As a result, the martingale
structure still holds under response and covariate dependent allocation
schemes, which is desirable for adaptive methods in clinical trials (cf. Hu and
Rosenberger, 2006).

On the other hand, the usual survival time-based approach results in the
following score process

(2.15)    $\sum_{i} \int_0^s [Z_i(w) - \bar Z(\beta; t, w)]\, N_i(t, dw),$

where $N_i(t, s) = \Delta_i 1(\tilde T_i \le s \wedge (t - U_i)^+)$; see Bilias et al. (1997). The
underlying martingale processes are

(2.16)    $m_i(t, s) = N_i(t, s) - \int_0^s 1(\tilde T_i \wedge (t - U_i)^+ \ge w)\, \exp(\beta Z_i(w))\, \lambda_0(w)\, dw$

with filtration $\{\mathcal{F}_t(s), s \ge 0\}$ containing all information up to survival time
$s$ and calendar time $t$ for all subjects enrolled before $t$.

Under (2.15) and (2.16), if the enrollment process follows a response and/or
covariate adaptive randomization procedure, $m_i(t, s)$ may no longer be an
$\mathcal{F}_t(s)$ martingale since for the $i$-th subject with $s < U_i < t$, the enrollment
allocation depends on the information up to its entry time $U_i$, that is
$\mathcal{F}_{U_i-}$, which may not be contained in $\mathcal{F}_t(s)$. Similarly, the empirical
process theory, which requires an independent allocation scheme, is also not
applicable.

Examples of adaptive design/allocation schemes include the randomized
play-the-winner rule (Wei and Durham, 1978), dynamic treatment regimes
(Pocock and Simon, 1975, Robins, 1986, and Murphy and Bingham, 2008)
with survival endpoints, efficient randomized adaptive designs (Hu et al.,
2009), and adaptive design with sample size re-estimation (Cui et al., 1999
and Shen and Fisher, 1999).

3. Main Convergence Results. For simplicity of notation, we assume
$\tau = \infty$ below. Following Sellke and Siegmund (1983) and Slud (1984), we
introduce an index $n$ to parameterize the size of the clinical trial. Thus,
notation in Section 2 will include subscript $n$. Specifically, we have $R_{n,u}$ for
$R_u$, $M_{n,t}$ for $M_t$, $p_n(\cdot)$ for $p(\cdot)$, $q_n(\cdot)$ for $q(\cdot)$, and
$f_{n;s,u,z,\delta}(t)$ for $f_{s,u,z,\delta}(t)$. Additional quantities with the subscript $n$
introduced henceforth are self-explanatory.

Let $[0, \tau) \times \mathcal{X}_1$ with $\mathcal{X}_1 \subset \mathcal{D}_z \times \{0, 1\}$ be the sub-mark space
in which we are interested. Note that for the Cox model,
$\mathcal{X}_1 = \mathcal{D}_z \times \{1\}$ and
$f_{n;s,u,z,\delta}(t) = z(s-u) - \bar Z_n(\beta; t, s-u)$. Let

    $V_{n,t,\vartheta} = \int_0^t \int_0^{s \wedge \vartheta} \int_{\mathcal{X}_1} f^{\otimes 2}_{n;s,u,z,\delta}(t)\, q_n(ds\,du\,dz\,d\delta),$

where $\vartheta \le t$ are the entry time and the calendar time, respectively, and
$a^{\otimes 2} = aa'$ for a column vector $a$. This may be interpreted as the accumulated
information up to time $t$ from all subjects whose entry times are before
$\vartheta$. Note that when $f_{n;s,u,z,\delta}(t)$ is $\mathcal{F}_{n,s}$ predictable, $V_{n,t,\vartheta}$ is
the predictable variation process $\langle M^f([0, \vartheta] \times \mathcal{X}_1) \rangle(t)$ defined in
(2.10). A natural estimator is

    $\hat V_{n,t,\vartheta} = \int_0^t \int_0^{s \wedge \vartheta} \int_{\mathcal{X}_1} f^{\otimes 2}_{n;s,u,z,\delta}(t)\, p_n(ds\,du\,dz\,d\delta).$

For notational simplicity, when $\vartheta = t$, we will use $V_{n,t}$ and $\hat V_{n,t}$ for
$V_{n,t,t}$ and $\hat V_{n,t,t}$ whenever there is no ambiguity.
3.1. Uni-dimensional $f_n$. In this subsection, we consider the case in
which $f_n$ is real-valued. It is well known that in a clinical trial with survival
as the endpoint, the power is associated with sample size through information
accumulated within the study period; see Friedman, Furberg and
DeMets (1998). Let $V_n$ be the total information used in the study design. For
the log-rank score process, Slud (1984) uses the number of enrollments for
$V_n$, with $n$ being the process index, while Sellke and Siegmund (1983) simply
take $V_n = n$. In general, as $t$ increases, the actual information $V_{n,t}$ increases
and may reach a planned portion of the information $V_n$. However, due to
interim adjustment, which is common in adaptive designs, the ratio $V_{n,t}/V_n$
may not converge, making the standard martingale central limit theorem
inapplicable to $M_{n,\cdot}/\sqrt{V_n}$ in time scale $t$. To circumvent this difficulty, we will
adopt an information-based time rescaling approach (Lai and Siegmund,
1983). Specifically, let

    $\sigma_{n,v} = \inf\{t : V_{n,t}/V_n \ge v\},$

an estimator of which is

    $\hat\sigma_{n,v} = \inf\{t : \hat V_{n,t}/V_n \ge v\}.$

We can interpret $\sigma_{n,v}$ as the calendar time at which a proportion $v$ of the
planned information has been accumulated. It is natural to expect that if
$\sigma_{n,v} < \infty$, then

    $B_n(v) = \frac{1}{\sqrt{V_n}} \int_0^{\sigma_{n,v}} \int_0^s \int_{\mathcal{X}_1} f_{n;s,u,z,\delta}(\sigma_{n,v})\, dM_{n,s}$

will converge to the Brownian motion process. If $\hat\sigma_{n,v}$ is a consistent
estimator, then we expect that

    $\hat B_n(v) = \frac{1}{\sqrt{V_n}} \int_0^{\hat\sigma_{n,v}} \int_0^s \int_{\mathcal{X}_1} f_{n;s,u,z,\delta}(\hat\sigma_{n,v})\, dM_{n,s}$

also converges to the Brownian motion. The following conditions are needed
for the above stated Brownian approximation.
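The information-based time rescaling can be illustrated on a discrete grid. The sketch below is only an approximation of the infimum in the definition of the estimated information time: it assumes the estimated information has already been evaluated at a finite, increasing list of calendar times, and the function name `information_time` and the numbers used are hypothetical.

```python
from typing import Optional, Sequence


def information_time(times: Sequence[float], info: Sequence[float],
                     planned: float, v: float) -> Optional[float]:
    """Return the first grid time t at which info(t) / planned >= v,
    or None if the fraction v is never reached on the grid
    (corresponding to an infinite information time)."""
    for t, v_t in zip(times, info):
        if v_t / planned >= v:
            return t
    return None


# Illustrative values only: estimated information at calendar times 1..4,
# with planned total information 40.
times = [1.0, 2.0, 3.0, 4.0]
info = [5.0, 12.0, 30.0, 41.0]
quarter_time = information_time(times, info, planned=40.0, v=0.25)
```

Interim analyses scheduled at information fractions such as 0.25, 0.5, and 0.75 would then be mapped to calendar times through this function, whatever the enrollment pattern turns out to be.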

Condition B. For enrollment process $R_n$ and information scale $V_n$, we
require the following to be true:

(i) Total information $V_n \to \infty$ in probability as $n \to \infty$, and for any $t > 0$,

(3.1)    $P(\tilde T_i \in [t, t+dt),\ \Delta_i \mid \mathcal{F}_{n,t-}, V_n) = P(\tilde T_i \in [t, t+dt),\ \Delta_i \mid \mathcal{F}_{n,t-}).$

(ii) For any finite $t$ and $v$, $R_{n,t}/V_n < \infty$ a.s. and $P(\sigma_{n,v} < \infty) \to 1$ as
$n \to \infty$.

(iii) For any $\tau < \infty$ and $0 \le s, u \le \tau$, there exists a constant $K_\tau$ such that
for any Borel sets $A \subset [0, \tau]$, $I \subset [0, \tau]$,

    $\lim_{n \to \infty} P\left( \sup_{A, I} \left\{ \int_A \int_I \int_{\mathcal{X}_1} q_n(ds\,du\,dz\,d\delta) - K_\tau \int_A \int_I ds\, dR_{n,u} \right\} \le 0 \right) = 1.$



Condition C. Score function $f_{n;s,u,z,\delta}(t)$ satisfies:

(i) $f_{n;s,u,z,\delta}(t) = f_{n;s',u',z',\delta}(t)$ if $s - u = s' - u'$ and $z(s-u) = z'(s'-u')$.

(ii) For any $\tau < \infty$, there exists a constant $K_\tau$ such that for $t \le \tau$,

    $\lim_{n \to \infty} P\left( \sup_{0 \le t \le \tau,\ U_i \le t} \left\{ |f_{n;U_i,U_i,Z_i,\Delta_i}(t)| + \int_0^{\tau} |f_{n;U_i+w,U_i,Z_i,\Delta_i}(t)|\, dw \right\} \le K_\tau \right) = 1,$

where $|\cdot|$ denotes the $L_1$-norm for a vector when $f_n$ is multidimensional.

(iii) There exists an $\mathcal{F}_{n,s}$ predictable function $g_{n;s,u}(s)$ such that for
$0 \le u, s \le t \le \tau$, $f_{n;s,u,Z_u,\Delta_u}(t) - g_{n;s,u}(s) \to 0$ uniformly as $n \to \infty$.

Remark 3.1 (Condition B). Equation (3.1) is trivially satisfied in commonly
encountered cases since $V_n$ is usually determined at the beginning of
a trial. Part (ii) means that the sample size up to any time $t$ cannot be more
than a multiple of the planned information $V_n$, and that the time it takes to
reach $vV_n$ is finite. It is easy to see that in the case of the Cox model, this
assumption is implied by Condition 4-1 of Sellke and Siegmund (1983). Part
(iii) means intuitively that $q_n(ds\,du)/(ds\,dR_u) \le K$ uniformly in probability.
Again, for the Cox model, it is satisfied when the covariates and the baseline
hazard functions are bounded.
Remark 3.2 (Condition C). Because $s - u$ corresponds to survival time,
part (i) is natural in that it requires the score and covariate functions to be
on the survival time scale; see (2.14) as an example. Moreover, if (i) is not
satisfied, we can construct a counterexample for Lemma 3.6 below, by letting
$f_{n;s,u,z,\delta}(t) - g_{n;s,u}(s) = 1$ at the jump points of $M_{n,s}$ and 0 otherwise. Parts
(ii) and (iii) are standard and analogous to Conditions 1-3 in Bilias et al.
(1997).

Remark 3.3. In practice, there might be a planned final analysis time
$\tau_n$. For instance, $\tau_n$ may be the time at which there are $n$ events observed
or at which a budget cap is reached. In this case, the stopping time $\sigma_{n,v}$ can
still be achieved before $\tau_n$ by taking a weight function $w_n$ adaptively and by
defining $w_n \cdot f_n$ as the new integrand function; see Shen and Cai (2003) for
an example related to sample size re-estimation. Therefore, Conditions B and
C may still be satisfied.

We now state the main result for uni-dimensional $f_n$. For any constant $\bar v$,
let $D([0, \bar v])$ be the space of cadlag (right continuous with left limits)
functions on $[0, \bar v]$ with the Skorokhod topology. Then, we have the following
Brownian approximation of the score process. It extends the results of
Sellke and Siegmund (1983) and Slud (1984) to cover the case with dependent
entry times and a more general integrand function $f_n$.

Theorem 3.4. Under Conditions A, B, and C, we have the following
weak convergence on the space $D([0, \bar v])$,

    $\{B_n(v),\ 0 \le v \le \bar v\} \xrightarrow{\mathcal{D}} \{B(v),\ 0 \le v \le \bar v\},$

where $B(v)$ is the Brownian motion process. Moreover, the convergence still
holds with $\sigma_{n,v}$ replaced by $\hat\sigma_{n,v}$, i.e.,

    $\{\hat B_n(v),\ 0 \le v \le \bar v\} \xrightarrow{\mathcal{D}} \{B(v),\ 0 \le v \le \bar v\}.$

Our proof of Theorem 3.4 relies on martingale-based techniques by making 
use of the martingale central limit theorem and certain maximal inequalities. 
It consists of several major steps corresponding to the following lemmas, 
whose proofs are given in Section 6. 

When $f_{n;s,u,z,\delta}(t)$ is $\mathcal{F}_{n,s}$ predictable, results for martingales may be used
and the martingale central limit theorem (Rebolledo, 1980) implies the
following weak convergence.

Lemma 3.5. Suppose that $f_{n;s,u,z,\delta}(t)$ is $\mathcal{F}_{n,s}$ predictable and uniformly
bounded in probability. Then under Condition B, we have

(3.2)    $\{B_n(v),\ 0 \le v \le \bar v\} \xrightarrow{\mathcal{D}} \{B(v),\ 0 \le v \le \bar v\}.$

Moreover, the convergence continues to hold with $\sigma_{n,v}$ replaced by $\hat\sigma_{n,v}$, i.e.,

(3.3)    $\{\hat B_n(v),\ 0 \le v \le \bar v\} \xrightarrow{\mathcal{D}} \{B(v),\ 0 \le v \le \bar v\}.$

In general, with staggered entry, $f_{n;s,u,z,\delta}(t)$ is often not $\mathcal{F}_{n,s}$ predictable.
Consequently, the corresponding integral may no longer be a martingale and
Lemma 3.5 cannot be applied directly. The following lemma shows that it
can be approximated by a martingale under suitable conditions.

Lemma 3.6. Let $\tau < \infty$. Under Conditions A, B, and C, we have

    $\sup_{\vartheta, t \in [0, \tau]} \frac{1}{\sqrt{V_n}} \left| \int_0^t \int_0^{s \wedge \vartheta} \int_{\mathcal{X}_1} [f_{n;s,u,z,\delta}(t) - g_{n;s,u}(s)]\, dM_{n,s} \right| \xrightarrow{P} 0,$

where $g_{n;s,u}(s)$ is defined as in Condition C(iii).

Remark 3.7. Lemma 3.6 provides a tightness result similar to that of
Lemma 2 in Gu and Lai (1991). Our use of the martingale structure along
the calendar time allows us to apply the martingale central limit theorem,
bypassing empirical process based approximations that may not be applicable
under adaptive design.

Proof of Theorem 3.4. With the preceding lemmas, it is now straightforward
to prove Theorem 3.4. In view of Condition B, we only need to
consider the case when $\sigma_{n,\bar v} < \tau$ a.s. with $\tau$ being a large enough constant.
From Lemma 3.6, it suffices to show that for $0 \le v \le \bar v$,

    $\frac{1}{\sqrt{V_n}} \int_0^{\sigma_{n,v}} \int_0^s \int_{\mathcal{X}_1} g_{n;s,u}(s)\, dM_{n,s}$

converges weakly to the Brownian motion. From an argument similar to
the proof of (3.3) (see Subsection 6.2.2), it is sufficient to show the weak
convergence of

    $\frac{1}{\sqrt{V_n}} \int_0^{\sigma'_{n,v}} \int_0^s \int_{\mathcal{X}_1} g_{n;s,u}(s)\, dM_{n,s},$

where $\sigma'_{n,v} = \inf\{t : \int_0^t \int_0^s \int_{\mathcal{X}_1} g^2_{n;s,u}(s)\, p_n(ds\,du\,dz\,d\delta) \ge vV_n\}$.
Since $g_{n;s,u}(s)$ is $\mathcal{F}_{n,s}$ predictable, we get the desired conclusion from
Lemma 3.5. □
3.2. Multidimensional $f_n$. For the multidimensional case, the above time
rescaling approach may not be directly applicable since we cannot scale
$V_{n,t,\vartheta}$ with a single growth rate in $t$. In other words, $\sigma_{n,v}$ is not well
defined. Nevertheless, under the usual variance stability condition (see (3.4)
below), we still have the weak convergence result, which extends Gu and Lai
(1991) and Bilias et al. (1997). More details in the case of the Cox model
are given in Section 4.

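Theorem 3.8. Let $\tau < \infty$ and assume there exists a nonrandom matrix
function $V(t, \vartheta)$, such that for $0 \le \vartheta \le t \le \tau$,

(3.4)    $\frac{V_{n,t,\vartheta}}{n} \xrightarrow{P} V(t, \vartheta).$

Then under Conditions A, B(iii), and C,
$n^{-1/2} \int_0^t \int_0^{s \wedge \vartheta} \int_{\mathcal{X}_1} f_{n;s,u,z,\delta}(t)\, dM_{n,s}$
converges to a zero-mean Gaussian process $\xi(t, \vartheta)$ on
$\{t, \vartheta : 0 \le \vartheta \le t \le \tau\}$ with continuous sample path and covariance function

    $E[\xi(t_1, \vartheta_1)\, \xi'(t_2, \vartheta_2)] = V(t_1 \wedge t_2,\ \vartheta_1 \wedge \vartheta_2).$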

Proof of Theorem 3.8. In view of the proof of Theorem 3.4, a key 
step for obtaining the desired weak convergence result is to establish the 
tightness result analogous to Lemma 3.6. This is shown by the next lemma, 
whose proof is given in Subsection 6.3. 

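Lemma 3.9. Under the same assumptions as those of Theorem 3.8, we
have

    $\sup_{\vartheta, t \in [0, \tau]} \frac{1}{\sqrt{n}} \left| \int_0^t \int_0^{s \wedge \vartheta} \int_{\mathcal{X}_1} [f_{n;s,u,z,\delta}(t) - g_{n;s,u}(s)]\, dM_{n,s} \right| \xrightarrow{P} 0.$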



Then, Theorem 3.8 follows from Lemma 3.9 and the functional martingale 
central limit theorem. □ 



4. Cox Model with adaptive entry. In clinical trials with adaptive 
allocation rules, patients are accrued sequentially and treatment assignment 
may depend on the observed responses, leading to dependent enrollment 
processes. In this section, we use the marked point process framework to 
formulate the Cox proportional hazards model based score processes under 
response and/or covariate adaptive allocation schemes. We also discuss how 
the general results and conditions presented in Section 3 may be applied and 
verified under the Cox model. 
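As a concrete illustration of a response-adaptive allocation rule of the kind considered here, the sketch below simulates a simplified urn version of the randomized play-the-winner rule (Wei and Durham, 1978). It assumes two arms, binary responses that are available immediately, and an urn update that adds one ball of the same arm after a success and one ball of the opposite arm after a failure; with survival endpoints responses are delayed, so this is only a simplified picture of how allocation can depend on earlier outcomes. The function name and parameters are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def randomized_play_the_winner(success_prob: Sequence[float],
                               n_patients: int,
                               seed: int = 0,
                               initial_balls: int = 1
                               ) -> Tuple[List[int], List[int]]:
    """Simulate a two-arm urn allocation with immediate binary responses.

    Returns the list of arm assignments (0 or 1) and the final urn
    composition [balls for arm 0, balls for arm 1]."""
    rng = random.Random(seed)
    urn = [initial_balls, initial_balls]
    assignments: List[int] = []
    for _ in range(n_patients):
        # Draw a ball (with replacement): arm 0 with probability urn[0] / total.
        arm = 0 if rng.random() < urn[0] / (urn[0] + urn[1]) else 1
        success = rng.random() < success_prob[arm]
        if success:
            urn[arm] += 1        # reward the arm that succeeded
        else:
            urn[1 - arm] += 1    # shift weight toward the other arm
        assignments.append(arm)
    return assignments, urn


# Illustrative run: arm 0 always succeeds, arm 1 always fails.
assignments, urn = randomized_play_the_winner((1.0, 0.0), 20, seed=1)
```

Because each assignment depends on the responses of earlier patients, the resulting covariates (treatment indicators) are not independent across subjects, which is exactly the dependence that the calendar-time filtration is designed to accommodate.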
4.1. Cox model with unidimensional parameter. Following the notation
in Sections 2 and 3, we have the compensator $q_n(ds\,du\,dz\,d\delta)$ defined on
$[0, \tau) \times \mathcal{X}$ satisfying

    $1(\delta = 1)\, q_n(ds\,du\,dz\,d\delta) =$
    $\quad 1(\tilde T_u \ge s-u,\ Z_u \in dz)\, \exp\{\beta Z_u(s-u)\}\, \lambda_0(s-u)\, dR_{n,u}\, ds$,  if $s \ge u$, $\delta = 1$;
    $\quad 0$,  otherwise.

From (2.12), for each $k = 0, 1, 2$, $\vartheta \ge 0$, and $w \ge 0$, we have

(4.1)    $\Gamma_{n,k}(\beta; \vartheta, w) = \int_0^{\vartheta} 1(\tilde T_u \ge w)\, Z_u^k(w)\, \exp\{\beta Z_u(w)\}\, dR_{n,u}.$

The score processes as defined in Subsection 2.2 become

    $U_n(\beta; t) = \int_0^t \int_0^s \int_{\mathcal{X}_1} [Z_u(s-u) - \bar Z_n(\beta; t, s-u)]\, dM_{n,s},$

    $U_n(\beta; t, \vartheta) = \int_0^t \int_0^{s \wedge \vartheta} \int_{\mathcal{X}_1} [Z_u(s-u) - \bar Z_n(\beta; t, s-u)]\, dM_{n,s},$

where $\mathcal{X}_1 = \mathcal{D}_z \times \{1\}$ and

    $\bar Z_n(\beta; t, w) = \frac{\Gamma_{n,1}(\beta; t-w, w)}{\Gamma_{n,0}(\beta; t-w, w)}.$

The following conditions for the Cox model imply Condition C in Section 3.

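C1. For every $i$, $Z_i(\cdot)$ is bounded and of uniformly bounded variation in the
sense that for any constant $\tau < \infty$, there exists a nonrandom constant $K_\tau$
such that for any subject $i$,

    $\lim_{n \to \infty} P\left( \sup_i \left\{ |Z_i(0)| + \int_0^{\tau} |Z_i(ds)| \right\} \le K_\tau \right) = 1.$

C2. For any $w$ and $\vartheta$, there exists an $\mathcal{F}_{n,w}$-measurable random variable (or
constant) $E_{n,k}(\vartheta, w)$ such that

    $\frac{1}{R_{n,\vartheta}} \sum_{U_i \le \vartheta} Z_i^k(w)\, \exp(\beta_0 Z_i(w))\, E[1(\tilde T_i \ge w) \mid \mathcal{F}_{n,\tau,U_i-}] - E_{n,k}(\vartheta, w) \xrightarrow{P} 0.$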
Remark 4.1. Condition C2 states the stability condition. For the special
case of independent enrollment allocation, $R_t$ is non-informative for $\tilde T$ and
Condition C2 holds naturally under the new filtration $\{\mathcal{F}'_{n,t}, 0 \le t \le \tau\}$
defined by $\mathcal{F}'_{n,t} = \mathcal{F}_{n,t} \cup \sigma\{R_{n,s}, \forall s \ge 0\}$.

Theorem 4.2. Suppose that $V_n$ is chosen to be $n$ and that
$f_{n;s,u,z,\delta}(t) = z(s-u) - \bar Z_n(\beta; t, s-u)$. Let $\hat\sigma_{n,v}$ be defined as in
Subsection 3.1. Then under Conditions A, B, C1, and C2, for any $\bar v$,
$n^{-1/2} U_n(\beta_0; \hat\sigma_{n,v})$ converges weakly to the Brownian motion process in
$v \in [0, \bar v]$, where $\beta_0$ is the true parameter value.

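Proof. By Condition B(ii), we only need to consider the case when
$\hat\sigma_{n,\bar v} < \tau$, where $\tau$ is a large constant. Since Conditions A, B, and C(i)
in Sections 2 and 3 are satisfied, in order to apply Theorem 3.4, it suffices to
show that Conditions C(ii) and C(iii) hold, i.e., $\bar Z_n(\beta_0; t, w)$ is of bounded
variation and converges to an $\mathcal{F}_{n,w}$ predictable $\bar Z_{n,P}(\beta_0; w)$ uniformly. This
is equivalent to uniform convergence of $\bar Z_n(\beta_0; \vartheta + w, w)$ with respect to
$\vartheta$ and $w$.

On $\{w, \vartheta : w \ge 0,\ \vartheta \ge 0,\ w + \vartheta \le \tau\}$, from (4.1), $\Gamma_{n,k}(\beta_0; \vartheta, w)$ can be
expressed as an integral with respect to $p_n(ds\,du\,dz\,d\delta)$:

    $\Gamma_{n,k}(\beta_0; \vartheta, w) = \int_0^{\vartheta} \int_u^{\infty} \int_{\mathcal{X}} z^k(w)\, \exp(\beta_0 z(w))\, 1(s - u \ge w)\, p_n(du\,ds\,dz\,d\delta).$

For $k = 0$, 1 and 2, we know that for $0 \le \vartheta \le \tau - w$,

    $M_{n,k}(\vartheta) = \frac{1}{n} \left( \Gamma_{n,k}(\beta_0; \vartheta, w) - \int_0^{\vartheta} E[1(\tilde T_u \ge w)\, Z_u^k(w)\, \exp(\beta_0 Z_u(w)) \mid \mathcal{F}_{n,\tau,u-}]\, dR_{n,u} \right)$

as processes in $\vartheta$ are $\mathcal{F}_{n,\tau,\vartheta}$ martingales. A simple application of Lenglart's
inequality (Lemma 6.3) gives that for any $w$,

(4.2)    $\sup_{0 \le \vartheta \le \tau} |M_{n,k}(\vartheta)| \xrightarrow{P} 0, \quad k = 0, 1, 2.$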
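Let $\varepsilon > 0$ and define the stopping time $a = \inf\{\vartheta : R_{n,\vartheta}/n \ge \varepsilon\}$.
Condition C1 implies that for any $\eta$ with $0 < \eta < \varepsilon$,

(4.3)    $\sup \left| \frac{1}{R_{n,\vartheta_1}} \sum_{U_i \le \vartheta_1} E[1(\tilde T_i \ge w)\, Z_i^k(w)\, \exp(\beta_0 Z_i(w)) \mid \mathcal{F}_{n,\tau,U_i-}] - \frac{1}{R_{n,\vartheta_2}} \sum_{U_i \le \vartheta_2} E[1(\tilde T_i \ge w)\, Z_i^k(w)\, \exp(\beta_0 Z_i(w)) \mid \mathcal{F}_{n,\tau,U_i-}] \right|$
         $\qquad \le 2 K_\tau^k \exp(\beta_0 K_\tau)\, \eta / \varepsilon,$

where the supremum is over $a \le \vartheta_1, \vartheta_2 \le \tau$ with
$|R_{n,\vartheta_1} - R_{n,\vartheta_2}| \le n\eta$, which gives the tightness result for

    $\sum_{U_i \le \vartheta} E[1(\tilde T_i \ge w)\, Z_i^k(w)\, \exp(\beta_0 Z_i(w)) \mid \mathcal{F}_{n,\tau,U_i-}] / R_{n,\vartheta}$

on $\{\vartheta : a \le \vartheta \le \tau\}$. Therefore (4.3), together with Conditions C1 and C2,
implies

(4.4)    $\sup_{a \le \vartheta \le \tau} \left| \frac{1}{R_{n,\vartheta}} \sum_{U_i \le \vartheta} E[1(\tilde T_i \ge w)\, Z_i^k(w)\, \exp(\beta_0 Z_i(w)) \mid \mathcal{F}_{n,\tau,U_i-}] - E_{n,k}(\vartheta, w) \right| \xrightarrow{P} 0.$

From (4.2) and (4.4), for any $w$, as $n \to \infty$,

    $\sup_{a \le \vartheta \le \tau} \left| \frac{\Gamma_{n,k}(\beta_0; \vartheta, w)}{R_{n,\vartheta}} - E_{n,k}(\vartheta, w) \right| \xrightarrow{P} 0.$

Since $\exp(\beta_0 Z_u(w))$ and $Z_u(w) \exp(\beta_0 Z_u(w))$ are of bounded variation for
$w \in [0, \tau]$, so is $\sup_{a \le \vartheta \le \tau} \Gamma_{n,k}(\beta_0; \vartheta, w)/R_{n,\vartheta}$. Therefore

    $\sup_w \sup_{a \le \vartheta \le \tau} \left| \frac{\Gamma_{n,k}(\beta_0; \vartheta, w)}{R_{n,\vartheta}} - E_{n,k}(\vartheta, w) \right| \xrightarrow{P} 0,$

which indicates the uniform convergence of $\bar Z_n(\beta_0; \vartheta + w, w)$ to an
$\mathcal{F}_{n,w}$ predictable $\bar Z_{n,P}(\beta_0; w)$ on
$\{w, \vartheta : w + \vartheta \le \tau,\ \vartheta \ge a,\ w \ge 0\}$.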

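For $\vartheta \ge 0$, to show the bounded variation of $\bar Z_n$ in $w$, let
$0 = w_1 < \cdots < w_{n_0} = \tau$ be a partition of $[0, \tau]$. For $1 \le i < n_0$,

    $\left| \frac{\Gamma_{n,1}(\beta_0; \vartheta, w_{i+1})}{\Gamma_{n,0}(\beta_0; \vartheta, w_{i+1})} - \frac{\Gamma_{n,1}(\beta_0; \vartheta, w_i)}{\Gamma_{n,0}(\beta_0; \vartheta, w_i)} \right|$

    $\quad \le \frac{|\Gamma_{n,1}(\beta_0; \vartheta, w_i) - \Gamma_{n,1}(\beta_0; \vartheta, w_{i+1})|}{\Gamma_{n,0}(\beta_0; \vartheta, w_i)} + \frac{|\Gamma_{n,0}(\beta_0; \vartheta, w_i) - \Gamma_{n,0}(\beta_0; \vartheta, w_{i+1})| \cdot |\Gamma_{n,1}(\beta_0; \vartheta, w_{i+1})|}{\Gamma_{n,0}(\beta_0; \vartheta, w_i)\, \Gamma_{n,0}(\beta_0; \vartheta, w_{i+1})}$

    $\quad \le \frac{|\Gamma_{n,1}(\beta_0; \vartheta, w_i) - \Gamma_{n,1}(\beta_0; \vartheta, w_{i+1})|}{R_{n,\vartheta}} K'_\tau + \frac{|\Gamma_{n,0}(\beta_0; \vartheta, w_i) - \Gamma_{n,0}(\beta_0; \vartheta, w_{i+1})|}{R_{n,\vartheta}} K''_\tau,$

where $K'_\tau$ and $K''_\tau$ are constants depending only on $\tau$. The desired conclusion
then follows from Condition C1.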

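Let $\hat B_n(v) = n^{-1/2} U_n(\beta_0; \hat\sigma_{n,v})$. Since $f_n^2$ is bounded, say by a constant $M$,
we have $\int_0^a f_n^2\, dp_n / n \le M\varepsilon$. Take $\varepsilon$ small enough and let
$\varepsilon' = M\varepsilon$; then $\hat\sigma_{n,\varepsilon'} \ge a$. Therefore, in view of Lemmas 3.5 and 3.6,
we get

    $\{\hat B_n(v),\ \varepsilon' \le v \le \bar v\} \xrightarrow{\mathcal{D}} \{B(v),\ \varepsilon' \le v \le \bar v\}.$

For the convergence property of the tail part $\{v : 0 \le v \le \varepsilon'\}$, we only
need to show that for any $\eta, \eta_1 > 0$ and all large $n$,

    $P\left( \sup_{v \le \varepsilon'} |\hat B_n(v) - B(v)| \ge \eta_1 \right) \le \eta.$

Since $B(v)$ can be bounded near 0, it suffices to show

    $P\left( \sup_{v \le \varepsilon'} |\hat B_n(v)| \ge \eta_1/2 \right) \le \eta/2.$

This can be done using an integration by parts argument similar to that of
the proof of Lemma 3.6. □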



4.2. Cox model with multidimensional regression parameter. For a $p$-dimensional covariate vector $Z$ and corresponding vector $\beta$, the notation and conditions in Subsection 4.1 generalize naturally to the multidimensional case. Specifically, we use $Z^{\otimes k}$ for $Z^k$, where $Z^{\otimes 0} = 1$, $Z^{\otimes 1} = Z$, and $Z^{\otimes 2} = ZZ'$, and we take the $|\cdot|$ in Condition C1 as the $L_1$ norm for a $p$-dimensional vector. Under these modifications, $\Gamma_{n,k}(\beta;\vartheta,w)$ becomes

$$\int_0^{\vartheta} 1(\tilde T_u \ge w)\, Z_u^{\otimes k}(w) \exp\{\beta' Z_u(w)\}\, dR_{n,u}. \tag{4.5}$$

The additional quantities used henceforth are self-explanatory.
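
To make the quantity in (4.5) concrete, the following is a minimal sketch of its empirical evaluation. It assumes, for illustration only, that covariates are constant in $w$ and that the enrollment process $R_{n,u}$ jumps by one at each entry time, so the integral reduces to a sum over subjects who entered by $\vartheta$. The function name `gamma_nk` and its argument names are hypothetical and do not come from the paper.

```python
import numpy as np

def gamma_nk(beta, entry, obs_time, Z, theta, w, k):
    """Empirical version of (4.5) under simplifying assumptions.

    entry    : (n,) entry times U_i
    obs_time : (n,) observed survival times
    Z        : (n, p) covariates, assumed constant in w (an assumption)
    Returns the sum over subjects with U_i <= theta of
    1(obs_time_i >= w) * Z_i^{(k)} * exp(beta' Z_i), for k = 0, 1, 2.
    """
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    entry = np.asarray(entry, dtype=float)
    obs_time = np.asarray(obs_time, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(entry), -1)
    # Subjects enrolled by theta and still at risk at survival time w.
    at_risk = (entry <= theta) & (obs_time >= w)
    weights = at_risk * np.exp(Z @ beta)
    if k == 0:
        return weights.sum()                    # scalar
    if k == 1:
        return weights @ Z                      # vector of length p
    if k == 2:
        return (Z * weights[:, None]).T @ Z     # p x p matrix
    raise ValueError("k must be 0, 1, or 2")
```

The three cases correspond to $Z^{\otimes 0} = 1$, $Z^{\otimes 1} = Z$, and $Z^{\otimes 2} = ZZ'$; the ratio of the $k = 1$ and $k = 0$ values is the usual risk-set average of the covariate.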

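To derive the weak convergence result analogous to Theorem 3.8, we need the following condition; see Conditions 2 and 3 in Bilias et al. (1997).

C3. For each $k = 0, 1$, and $2$, there exists a non-random $E_k(\vartheta, w)$ such that $E_1(\cdot,w)/E_0(\cdot,w)$ is continuous on $[0,\tau-w]$ and, as $n \to \infty$,

$$E_{n,k}(\vartheta,w) - E_k(\vartheta,w) \overset{P}{\longrightarrow} 0$$

for all positive $w$ satisfying $\vartheta + w \le \tau$, and

$$\sup_{0 \le t \le \tau} \int_0^t \left\| \frac{E_{n,1}(\vartheta,t-\vartheta)}{E_{n,0}(\vartheta,t-\vartheta)} - \frac{E_1(\vartheta,t-\vartheta)}{E_0(\vartheta,t-\vartheta)} \right\|^2 d\vartheta \to 0.$$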



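Theorem 4.3. Under Conditions A and C1-C3, $n^{-1/2}U_n(\beta_0;t)$ converges weakly to a vector-valued zero-mean Gaussian process $\xi$ on $[0,\tau]$ with continuous sample path and covariance function $E[\xi(t_1)\xi'(t_2)]$ equal to

$$\int_0^{t_1\wedge t_2} \left[ E_2(t_1\wedge t_2 - w, w) - \frac{E_1^{\otimes 2}(t_1\wedge t_2 - w, w)}{E_0(t_1\wedge t_2 - w, w)} \right] \lambda_0(w)\, dw.$$

Moreover, $n^{-1/2}U_n(\beta_0;t,\vartheta)$ converges weakly to a vector-valued zero-mean Gaussian process $\xi(t,\vartheta)$ on $\{t,\vartheta : 0 \le \vartheta \le t \le \tau\}$ with continuous sample path and covariance function $E[\xi(t_1,u_1)\xi'(t_2,u_2)]$ equal to

$$\int_0^{t_1\wedge t_2} \left[ E_2(u_{t_1\wedge t_2,w}, w) - \frac{E_1^{\otimes 2}(u_{t_1\wedge t_2,w}, w)}{E_0(u_{t_1\wedge t_2,w}, w)} \right] \lambda_0(w)\, dw,$$

where $u_{t_1\wedge t_2,w} = u_1 \wedge u_2 \wedge (t_1\wedge t_2 - w)$.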



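Proof. Thanks to Lemma 3.9, it suffices to prove that for every $n_0 > 0$ and partition $0 \le u_1 < \cdots < u_{n_0}$, $\{n^{-1/2}(U_n(\beta_0;t,u_1), \ldots, U_n(\beta_0;t,u_{n_0})),\ 0 \le t \le \tau\}$ converges weakly to a multivariate normal distribution $\{\xi(t,u_1), \ldots, \xi(t,u_{n_0})\}$, where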



o Jo 



•Vi 



Z u (s - u) 



E n ,i(t - (s - u),s - u) 
E n ,o{t -(s-u),s- u) 



M n (ds). 



From Lemma 2.2, $\{U_n(\beta_0;t,u_j),\ \mathcal{F}_{n,t},\ 0 \le t \le \tau\}$ are martingales with predictable covariation processes



Jo 



Ui Auj As 



Z u (s - u 

Xi 

E n ,l(t-(S 



u), s 



u) 



E 2 (u t , w ,w) 



E n ,o(t - (s 

E 1 [U tjU 



— it 
,w) 



E (u t , w ,w) 



s — u) 
Xo(w)dw, 



q n (ds du dz dS) 



where $u_{t,w} = u_1 \wedge u_2 \wedge (t-w)$ and the convergence follows from Condition C3. Then, we can apply Rebolledo's (1980) martingale functional central limit theorem to obtain the desired weak convergence result in the same way as in the proof of Lemma 3.5. □



4.3. Convergence of the maximum partial likelihood estimator. Let the Cox partial likelihood estimators $\hat\beta(t,\vartheta)$ and $\hat\beta(v)$ be solutions to $U(\beta;t,\vartheta) = 0$ and $U(\beta;\sigma_{n,v}) = 0$, respectively. In this subsection we give uniform consistency and weak convergence for the sequentially computed maximum partial likelihood estimator $\hat\beta$.
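
The idea of a sequentially computed estimator can be illustrated numerically. The sketch below is not the paper's procedure: it assumes a one-dimensional, time-fixed covariate, no censoring other than the calendar-time cut, and simple non-adaptive randomization in the simulated data. The names `cox_score`, `cox_mple`, and `simulate` are hypothetical. At calendar time $t$, only subjects who entered by $\min(\vartheta, t)$ are used, each followed for $\min(T_i, t - U_i)$.

```python
import numpy as np

def cox_score(beta, t, theta, entry, fail, z):
    """Score and observed information of a one-dimensional Cox partial
    likelihood, using only data visible at calendar time t from subjects
    who entered no later than theta."""
    entry, fail, z = (np.asarray(a, dtype=float) for a in (entry, fail, z))
    keep = entry <= min(theta, t)
    entry, fail, z = entry[keep], fail[keep], z[keep]
    follow_up = np.minimum(fail, t - entry)   # survival time observed by t
    event = entry + fail <= t                 # failure already seen at t
    weight = np.exp(beta * z)
    score, info = 0.0, 0.0
    for w, z_i in zip(follow_up[event], z[event]):
        at_risk = follow_up >= w              # still under observation at w
        g0 = weight[at_risk].sum()
        g1 = (weight * z)[at_risk].sum()
        g2 = (weight * z * z)[at_risk].sum()
        score += z_i - g1 / g0
        info += g2 / g0 - (g1 / g0) ** 2
    return score, info

def cox_mple(t, theta, entry, fail, z, beta=0.0, tol=1e-8, max_iter=50):
    """Newton-Raphson root of the score in beta."""
    for _ in range(max_iter):
        score, info = cox_score(beta, t, theta, entry, fail, z)
        if info <= 0:
            raise ValueError("no information available at this (t, theta)")
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

def simulate(n=2000, beta0=0.7, accrual=2.0, seed=0):
    """Staggered uniform entry, binary covariate, exponential failure times
    with hazard exp(beta0 * z). All settings are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    entry = rng.uniform(0.0, accrual, size=n)
    z = rng.integers(0, 2, size=n).astype(float)
    fail = rng.exponential(scale=np.exp(-beta0 * z))
    return entry, fail, z

if __name__ == "__main__":
    entry, fail, z = simulate()
    for t in (1.0, 2.0, 3.0):
        print(t, cox_mple(t, theta=t, entry=entry, fail=fail, z=z))
```

Recomputing the estimate at several calendar times, as in the loop at the end, mimics interim analyses; under an adaptive design the allocation step in `simulate` would depend on earlier outcomes, which is the setting the theory addresses.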

We need the following condition, which ensures the presence of enough information, to obtain uniform consistency:

C4. There exists $\tau_0 \in (0,\tau]$ such that

$$\liminf_{n\to\infty} \lambda_{\min}\left( \frac{1}{n} \int_0^{\tau_0}\!\!\int_{\mathcal{X}_1} \left[ Z_u(s-u) - \frac{E_{n,1}(\tau_0-(s-u),\,s-u)}{E_{n,0}(\tau_0-(s-u),\,s-u)} \right]^{\otimes 2} q_n(ds\,du\,dz\,d\delta) \right) = v_0 > 0, \quad \text{a.s.},$$

where $E_{n,k}$ is defined as in Condition C2 and $\lambda_{\min}(A)$ denotes the minimum eigenvalue of a symmetric matrix $A$.

Theorem 4.4 (One-dimensional Cox model). Under Condition C4 and the same assumptions as those of Theorem 4.2, for any $\bar v$, $\hat\beta(v)$ is uniformly consistent in the sense that

$$\lim_{n\to\infty}\ \sup_{v_0 \le v \le \bar v} |\hat\beta(v) - \beta_0| = 0, \quad \text{in prob.}$$

Moreover, $\{\sqrt{n}\, v\,(\hat\beta(v) - \beta_0),\ v_0 \le v \le \bar v\}$ converges weakly to the Brownian motion process $B(v)$.

Theorem 4.5 (Multidimensional Cox model). Let $\tau < \infty$. Under Conditions A, C1, C2, and C4, $\hat\beta(t,\vartheta)$ is uniformly consistent in the sense that

$$\lim_{n\to\infty}\ \sup_{\tau_0 \le \vartheta \le t \le \tau} \|\hat\beta(t,\vartheta) - \beta_0\| = 0, \quad \text{in prob.}$$

Moreover, if C3 is also satisfied, $\{\sqrt{n}(\hat\beta(t,\vartheta) - \beta_0),\ \tau_0 \le \vartheta \le t \le \tau\}$ converges weakly to a vector-valued zero-mean Gaussian random field $\eta$ with covariance

$$E[\eta(t_1,u_1)\eta'(t_2,u_2)] = E^{-1}[\xi^{\otimes 2}(t_1,u_1)]\, E[\xi(t_1,u_1)\xi'(t_2,u_2)]\, E^{-1}[\xi^{\otimes 2}(t_2,u_2)],$$

where $\xi$ is defined as in Theorem 4.3.

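To prove Theorem 4.5, we need the following lemma, which is a restatement of Lemma A.5 in Bilias et al. (1997).

Lemma 4.6. Consider a set of functions $\{f_{n,\alpha} : n \ge 1,\ \alpha \in A\}$ from $R^d$ to $R^d$. Suppose that (i) $\frac{\partial}{\partial\theta} f_{n,\alpha}(\theta)$ are nonnegative definite for all $n, \alpha, \theta$; (ii) $\sup_{\alpha} \|f_{n,\alpha}(\theta_0)\| \to 0$ as $n \to \infty$; (iii) there exists a neighborhood of $\theta_0$, denoted by $N(\theta_0)$, such that

$$\liminf_{n\to\infty}\ \inf_{\alpha \in A}\ \inf_{\theta \in N(\theta_0)} \lambda_{\min}\left( \frac{\partial f_{n,\alpha}(\theta)}{\partial\theta} \right) > 0,$$

where $\lambda_{\min}$ is the minimum eigenvalue as defined in C4. Then there exists $n_0$ such that for every $n > n_0$ and $\alpha \in A$, $f_{n,\alpha}$ has a unique root $\theta_{n,\alpha}$ and $\sup_{\alpha \in A} \|\theta_{n,\alpha} - \theta_0\| \to 0$.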

Proof of Theorem 4.5. We only need to prove the multidimensional case. From the same argument as in the proof of Theorem 4.2, as $n \to \infty$,

1 



sup 



n 



0, 



and 
(4.6) 



1 

sup — 



d_ 

d/3 



U((3 ;t,#) 



t r sA$ 



JO 



Z u (s - u) 



E n ,i(t - (s - u),s - u) 
E n ,o(t - (s-u),s- u) 



q n {ds du dz dS) 



0. 



Since $\frac{1}{n}\frac{\partial}{\partial\beta}U(\beta;t,\vartheta)$ has a uniformly bounded derivative with respect to $\beta$, Condition C4 and (4.6) imply that there exists a neighborhood of $\beta_0$, $\mathcal{N}(\beta_0)$, such that in probability

$$\liminf_{n\to\infty}\ \inf_{\tau_0 \le \vartheta \le t \le \tau}\ \inf_{\beta \in \mathcal{N}(\beta_0)} \lambda_{\min}\left( -\frac{1}{n}\frac{\partial}{\partial\beta} U(\beta;t,\vartheta) \right) > 0. \tag{4.7}$$

Therefore, by Lemma 4.6, we obtain consistency.

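By the Taylor expansion, uniformly on $\tau_0 \le \vartheta \le t \le \tau$,

$$\sqrt{n}\,(\hat\beta(t,\vartheta) - \beta_0) = \left[ -\frac{1}{n}\frac{\partial}{\partial\beta}U(\beta_0;t,\vartheta) \right]^{-1} \frac{1}{\sqrt{n}}\, U(\beta_0;t,\vartheta) + o_p(1),$$

where $o_p$ is uniform for $\tau_0 \le \vartheta \le t \le \tau$. From this and Theorem 4.3, the weak convergence of $\sqrt{n}(\hat\beta(t,\vartheta) - \beta_0)$ follows. □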



5. Discussion. General statistical methods and theory usually assume that observations from different study subjects are independent. In practice, such an assumption may be violated. This paper deals with survival studies in which patients' entry and treatment allocations are adaptive and dependent on previous outcomes. Through carefully defined marked point processes, it provides a general framework under which a martingale-based approach is developed. It is shown that the usual score process for sequential data monitoring (Jennison and Turnbull, 2000; Proschan, Lan and Wittes, 2006) can still be approximated by a time-rescaled Brownian motion process, which is the theoretical cornerstone of modern group sequential methods for clinical trials. The results establish a bridge between sequential analysis for survival endpoints with staggered entry (Sellke and Siegmund, 1983; Slud, 1984; Gu and Lai, 1991; Bilias et al., 1997) and covariate/response-adaptive treatment allocation designs (Hu and Rosenberger, 2006). Specific details are given for the Cox-model-based score processes.

The theoretical framework and asymptotic results developed in this paper may be extended to other follow-up studies with more general outcome variables. For studies with longitudinal outcomes, dynamic regression models have been proposed and studied (see Martinussen and Scheike, 2000). Consideration of staggered entry and outcome-dependent allocation could complicate the analysis considerably. It is hoped that the approach of this paper will provide a basis for developing a new way to handle such study designs.

6. Proofs of the Main Results. 

6.1. Proof of Lemma 2.2. Following Chapter II in Jacod and Shiryaev (2003), the random measure $p(\cdot)$ has an $\mathcal F_t$-predictable compensator $q(\cdot)$ of the form $E(p(dt\,du\,dz\,d\delta) \mid \mathcal F_{t-})$, and

$$M_t^f(\mathcal X) = \int_0^t\!\!\int_{\mathcal X} f_{s,u,z,\delta}(s)\,\big(p(ds\,du\,dz\,d\delta) - q(ds\,du\,dz\,d\delta)\big) \tag{6.1}$$

is a martingale for any $\mathcal F_s$-predictable and integrable function $f_{s,u,z,\delta}(s)$. Thus, the first part of Lemma 2.2 follows.

As to the second part, for fixed $t$ and any $0 \le u_1 < u_2 \le t$,

$$E\big(M_{t,u_2}(E) - M_{t,u_1}(E) \mid \mathcal F_{t,u_1}\big) = E\Big(\int_0^t\!\!\int_{(u_1,u_2]}\!\int_E (p-q)(ds\,du\,dz\,d\delta) \,\Big|\, \mathcal F_{t,u_1}\Big) = 0,$$

which is due to Condition A and the first conclusion in this Lemma. Therefore the desired result follows.

6.2. Proof of Lemma 3.5. We separate our proof into two parts. In Subsection 6.2.1, we prove (3.2); (3.3) is proved in Subsection 6.2.2.

6.2.1. Proof of (3.2). Since $f_{n;s,u,z,\delta}(t)$ is assumed to be $\mathcal F_{n,s}$-predictable, denote $f_{n;s,u,z,\delta}(t)$ by $f_{n;s,u,z,\delta}(s)$ when there is no ambiguity. Consider the new filtration

$$\mathcal F'_{n,t} = \sigma\{\mathcal F_{n,t}, V_n\}.$$

Under this filtration, since $R_u$, $Z_u$ are predictable,

$$\begin{aligned} E\big(p_n(ds\,du\,dz\,d\delta) \mid \mathcal F'_{n,s-}\big) &= E\big(1(u + \tilde T_u \in ds,\ Z_u \in dz,\ \Delta_u = \delta)\,dR_u \mid \mathcal F'_{n,s-}\big) \\ &= E\big(1(u + \tilde T_u \in ds,\ \Delta_u = \delta) \mid \mathcal F'_{n,s-}\big)\, 1(Z_u \in dz)\, dR_u, \end{aligned}$$

which equals, by Condition B(i), the compensator for $p_n(ds\,du\,dz\,d\delta)$ under the original filtration,

$$q_n(ds\,du\,dz\,d\delta) = E\big(1(u + \tilde T_u \in ds,\ \Delta_u = \delta) \mid \mathcal F_{n,s-}\big)\, 1(Z_u \in dz)\, dR_u.$$

Then $B_n(v)$ is a martingale with respect to our new filtration by Lemma 2.2.

Note that $\langle B_n \rangle(v) = v$. From the central limit theorem for martingales in Rebolledo (1980), we only need to show that the quadratic variation process of any $\epsilon$-jump process converges to zero in probability. For any $\epsilon > 0$, the $\epsilon$-jump process

= J2 AB nHl(\AB n (w)\>e) 



JO JXi 



'Vr,. 



+ - 



n JO JO JX 



1 



n JO JO JX 



/n;s,M,2,5( , s)l 
fn;s,u,z,s(s^) 1 



fn;s,u,z,si.^) 



VVn 

fn;s,u,z,s{s 



> ej p n (ds du dz dS) 

> e^j dM nyS 

> e I q n (ds du dz dS) . 



Thus, as f n is uniformly bounded in probability 

fn;s,u,z,&(s) 



(Bn,e)(v) 
1 



Vn 



fn:s,u,z,s{ S )l 



n JO JO JX X 

0, as n — )• oo. 



> e I q n (dsdudzd5) 



Therefore $\{B_n(v),\ 0 \le v \le \bar v\} \overset{D}{\longrightarrow} \{B(v),\ 0 \le v \le \bar v\}$.

6.2.2. Proof of (3.3). Similarly we denote f n -s,u,z,s{t) by f n ;s,u,z,s{s), 
which is J- ntS predictable. By Condition B (ii), we only need to consider 
the case when a ni y, a n ^ < r for some constant r < oo. 

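The martingale

$$m_n(t) = \frac{1}{V_n} \int_0^t\!\!\int_{\mathcal X} f^2_{n;s,u,z,\delta}(s)\,\big(p_n(ds\,du\,dz\,d\delta) - q_n(ds\,du\,dz\,d\delta)\big) \tag{6.2}$$

satisfies $\langle m_n \rangle(\tau) \overset{P}{\longrightarrow} 0$ for uniformly bounded $f_n$ under Condition B. From Lenglart's inequality, we get

$$\sup_{0 \le t \le \tau} |m_n(t)| \overset{P}{\longrightarrow} 0.$$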

Therefore for any 5 > 0, the definition of a n)V implies that the following 
holds uniformly in probability as n — > oo 

rcrn,v r 

(v - 6)V n < / fn- s ,u,zA S )Pn( ds du dz d $) ~ m n{o- n ,v) ' V n , 



JX 






(v + 5)V n > / / f n . suz5 {s)p n (dsdudzd5) - m n {a n>v ) -V n , 
Jo J x 



W n - /*"■» f x fl >s>u>z>s (s)q n (ds du dz dS) 



< 5V n . There- 



which indicates 
fore, 

P(o~n,v-6 < &n,v < <r n>v +s,Vv G [0, v\) ->• as n ->• oo, 
and the following holds uniformly in probability as n — >• oo 

sup \B n {v) - B n (v)\ < sup \B n (t) - B n (s)\. 

0<V<V 0<s,t<v 

\s — t\<5 

Therefore, from Lemma 3.5, for any e, rj > rji > 0, there exists 5 > such 
that for big enough n, 

P( sup - B n («)| >e) <P( sup |J3 n (i) - B„(s)| > e J + ??i < rj, 

^0<v<v J V 0<s,t<v J 

\s — t\<5 

which completes our proof. 

6.3. Proof of Lemma 3.6 and Lemma 3.9. We only need to prove Lemma 3.6; Lemma 3.9 follows from the same arguments. Since $\Delta$ is discrete, without loss of generality, it suffices to consider the sub-mark space $\mathcal X_1$ with $\delta = 1$. For convenience of statement, we use $f_n(t,s,u)$ for $f_{n;s,u,Z_u,\Delta_u}(t)$ and $g_n(s,u)$ for $g_{n;s,u}(s)$.

Note that $p_n(ds\,du\,dz,\ \delta=1) = 1(u+T_u \in ds,\ Z_u \in dz,\ \Delta_u = 1)\,dR_{n,u}$. From Condition A and Lemma 2.2,

$$\begin{aligned} q_n(ds\,du\,dz,\ \delta=1) &= E\big(p_n(ds\,du\,dz,\ \delta=1) \mid \mathcal F_{n,s-}\big) \\ &= E\big(1(u+T_u \in ds,\ \Delta_u=1) \mid \mathcal F_{n,s-}\big)\,1(Z_u \in dz)\,dR_{n,u}. \end{aligned}$$

From the above, we can define a new counting measure and its compensator on $[0,\tau]\times[0,\tau]$ by $p^*_n(ds\,du) = 1(u+T_u\in ds,\ \Delta_u=1)\,dR_{n,u}$ and $q^*_n(ds\,du) = E(1(u+T_u\in ds,\ \Delta_u=1) \mid \mathcal F_{n,s-})\,dR_{n,u}$. Note that $p^*_n(ds\,du) = \int_{\mathcal X_1} p_n(ds\,du\,dz\,d\delta)$ and $q^*_n(ds\,du) = \int_{\mathcal X_1} q_n(ds\,du\,dz\,d\delta)$. From Lemma 2.2, we get the martingale measure $dM^*_{n,s} = p^*_n(ds\,du) - q^*_n(ds\,du)$, and for any $\mathcal F_{n,t}$-measurable and integrable $f_n$,



$$\int_0^t\!\!\int_0^{\vartheta}\!\!\int_{\mathcal X_1} f_{n;s,u,z,\delta}(t)\, dM_{n,s} = \int_0^t\!\!\int_0^{\vartheta} f_n(t,s,u)\, dM^*_{n,s}.$$

When there is no ambiguity, we use the notation $p_n(ds\,du)$, $q_n(ds\,du)$, and $M_{n,s}$ instead of $p^*_n(ds\,du)$, $q^*_n(ds\,du)$, and $M^*_{n,s}$.

Let $p_n(ds,u) = I(u+T_u \in ds)$ and $q_n(ds,u) = E\{p_n(ds,u) \mid \mathcal F_{n,s-}\}$, which are the counting measure and the corresponding compensator for the subject who enrolled at time $u$. Then Lemma 2.2 implies that

$$M_n(ds,u) = p_n(ds,u) - q_n(ds,u)$$

is a martingale measure on $[u,\tau]$, which defines a basic martingale measure for each subject in the sense that if $u = U_i$, $M_n(ds,u) = I(U_i + T_i \in ds) - q_n(ds,U_i)$; see (2.16) as an example for the Cox model. Let $M_{n,t}(u) = \int_u^t M_n(ds,u)$, which is the total measure of the interval $[u,t]$ under $M_n(ds,u)$. Let $M_{n,t}(du) = [\int_u^t M_n(ds,u)] \cdot dR_{n,u}$, which defines a martingale measure along entry time for all subjects who enrolled before time $t$.

Denote $M_{n,t,\vartheta}(\mathcal X_1)$ in Lemma 2.2 by $M_{n,t,\vartheta}$. It can be taken as a martingale along both calendar and entry times, i.e., $M_{n,t,\vartheta} = \int_0^t\!\int_0^{\vartheta\wedge s} dM_{n,s}$ is a martingale in $t$ for any $\vartheta$ and $M_{n,t,\vartheta} = \int_0^{\vartheta} M_{n,t}(du)$ is a martingale in $\vartheta$ for any $t$. When $\vartheta = t$, we have $M_{n,t,t} = \int_0^t\!\int_0^s dM_{n,s}$, which is $M_{n,t}(\mathcal X_1)$. Similarly, define the random integral $M_{n,w,\vartheta}$ with respect to survival time $w$ and entry time $\vartheta$ by

$$M_{n,w,\vartheta} = \int_0^{\vartheta} M_{n,w+u}(du) \left[ = \int_0^{\vartheta} M_{n,w+u}(u)\, dR_{n,u} \right].$$

Note that $M_{n,w,\vartheta}$ is defined on the information observed before entry time $\vartheta$ and survival time $w$.

We now proceed to prove Lemma 3.6. The following two propositions play a key role, and we will give their proofs in Sections 6.3.1 and 6.3.2, respectively. Proposition 6.1 shows the tightness for $M_{n,t,\vartheta}/\sqrt{V_n}$ along calendar and entry time.

Proposition 6.1. Under Conditions A and B, for any $\epsilon > 0$, there exist a constant $n_0 < \infty$ and partitions $0 = u_{n,0} < u_{n,1} < \cdots < u_{n,n_0} = \tau$, which may be random, such that for all large $n$,

$$P\left( \max_{0 \le j < n_0}\ \sup_{\substack{\vartheta \in [u_{n,j},\,u_{n,j+1}] \\ 0 \le t \le \tau}} |W_{n,t,\vartheta} - W_{n,t,u_{n,j}}| > \epsilon \right) < \epsilon,$$

where $W_{n,t,\vartheta} = M_{n,t,\vartheta}/\sqrt{V_n}$.

The following proposition shows the tightness property for $M_{n,w,\vartheta}/\sqrt{V_n}$ along survival and entry time.

Proposition 6.2. Under Conditions A and B, for any $\epsilon > 0$, there exist partitions $0 = w_0 < w_1 < \cdots < w_{N_0} = \tau$ and $0 = u_{n,0} < u_{n,1} < \cdots < u_{n,n_0} = \tau$ such that for all large $n$,

$$P\left( \max_{\substack{0 \le j < n_0 \\ 0 \le k < N_0}}\ \sup_{\substack{\vartheta \in [u_{n,j},\,u_{n,j+1}] \\ w \in [w_k,\,w_{k+1}]}} |W_{n,w,\vartheta} - W_{n,w_k,u_{n,j}}| > \epsilon \right) < \epsilon,$$

where $W_{n,w,\vartheta} = M_{n,w,\vartheta}/\sqrt{V_n}$.

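Proof of Lemma 3.6. To show the uniform convergence result, note that for any $0 \le \vartheta \le t \le \tau$,

$$\begin{aligned} &\frac{1}{\sqrt{V_n}} \int_0^t\!\!\int_0^{s\wedge\vartheta} \big(f_n(t,s,u) - g_n(s,u)\big)\big(p_n(ds\,du) - q_n(ds\,du)\big) \\ &\quad= \frac{1}{\sqrt{V_n}} \int_0^{\vartheta}\!\!\int_u^t \big(f_n(t,s,u) - g_n(s,u)\big)\big(p_n(du\,ds) - q_n(du\,ds)\big) \\ &\quad= \frac{1}{\sqrt{V_n}} \int_0^{\vartheta} \left[ \int_u^t \big(f_n(t,s,u) - g_n(s,u)\big)\, M_n(ds,u) \right] dR_{n,u}. \end{aligned}$$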



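Since the total variations of $f_n$ and $g_n$ are bounded and $M_{n,s}(u)$ is a martingale with jumps no bigger than 1, the quadratic covariation $[f_n(t,\cdot,u) - g_n(\cdot,u),\ V_n^{-1/2} M_{n,\cdot}(u)](t)$ converges to 0 uniformly in probability; then, using integration by parts, we have

$$\begin{aligned} &\frac{1}{\sqrt{V_n}} \int_0^{\vartheta} \left[ \int_u^t \big(f_n(t,s,u) - g_n(s,u)\big) M_n(ds,u) \right] dR_{n,u} \\ &\quad= \frac{1}{\sqrt{V_n}} \int_0^{\vartheta} \Big[ M_{n,t}(u)\big(f_n(t,t,u) - g_n(t,u)\big) - M_{n,u}(u)\big(f_n(t,u,u) - g_n(u,u)\big) \\ &\qquad\qquad - \int_u^t M_{n,s}(u)\big(f_n(t,ds,u) - g_n(ds,u)\big) \Big]\, dR_{n,u} + o_p(1) \\ &\quad= \frac{1}{\sqrt{V_n}} \int_0^{\vartheta} \big(f_n(t,t,u) - g_n(t,u)\big) M_{n,t}(du) \\ &\qquad - \frac{1}{\sqrt{V_n}} \int_0^{\vartheta}\!\!\int_u^t M_{n,s}(u)\big(f_n(t,ds,u) - g_n(ds,u)\big)\, dR_{n,u} + o_p(1), \end{aligned} \tag{6.3}$$

where $o_p(1)$ is a small term converging uniformly to 0 in probability.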

Consider the first term in (6.3). For any $\epsilon > 0$, Proposition 6.1 shows that there exists a partition $0 = u_{n,0} < u_{n,1} < \cdots < u_{n,n_0} = \tau$ such that for all large $n$, with probability bigger than $1 - \epsilon$,

$$\sup_{i;\ u \in (u_{n,i},\,u_{n,i+1}]} |M_{n,t,u_{n,i+1}} - M_{n,t,u}| / \sqrt{V_n} < \epsilon.$$

Therefore, again by integration by parts, the following result holds uniformly on $0 \le \vartheta \le t \le \tau$ for all large $n$, with probability bigger than $1 - 2\epsilon$:



1 



/V n J0 
1 



(f n (t, t, u) - g n (t, u))M n t {du) 



1 



(fn(t,t,&)-g n (t,0))M nt t,t ~ -=(/ n (t,i,0) -9n{tM M n,t,Q 



Vn 



M nt t,u(fn(t, t, du) - g n (t, du)) 



+ o n (l) 



< 



:(f n (t,t,0)-g n (t,#))M n>tt 



no 



< 3e • K T , 



i=l 



M ntt ,u nti+1 {fn(t, t, du) - g n (t, du)) 



+ 2e-K T 



where $K_\tau$ is the total variation bound for $f_n(t,s,u) = f_n(t,s-u,0)$, and the last step follows from the Lenglart inequality.

For the second term in (6.3), by Condition C(i),



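$$\begin{aligned} &\frac{1}{\sqrt{V_n}} \int_0^{\vartheta}\!\!\int_u^t M_{n,s}(u)\big(f_n(t,ds,u) - g_n(ds,u)\big)\, dR_{n,u} \\ &\quad= \frac{1}{\sqrt{V_n}} \int_0^{\vartheta}\!\!\int_u^t M_{n,s}(u)\big(f_n(t,d(s-u),0) - g_n(d(s-u),0)\big)\, dR_{n,u} \\ &\quad= \frac{1}{\sqrt{V_n}} \int_0^t \left[ \int_0^{(t-w)\wedge\vartheta} M_{n,w+u}(u)\, dR_{n,u} \right] \big(f_n(t,dw,0) - g_n(dw,0)\big). \end{aligned}$$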



Recall that we let $M_{n,w,\vartheta} = \int_0^{\vartheta} M_{n,w+u}(u)\,dR_{n,u}$; then, from Proposition 6.2, there exist partitions $0 = w_0 < w_1 < \cdots < w_{N_0} = \tau$ and $0 = u_{n,0} < u_{n,1} < \cdots < u_{n,n_0} = \tau$ such that

$$\sup_{\substack{i,j;\ w \in [w_i,\,w_{i+1}), \\ u \in [u_{n,j},\,u_{n,j+1})}} \frac{1}{\sqrt{V_n}}\, |M_{n,w,u} - M_{n,w_i,u_{n,j}}| < \epsilon;$$

then, similar to the proof for the first term in (6.3), we get that the following holds with probability bigger than $1 - 2\epsilon$ for all large $n$:

I / M n , w+u (u)dR n>u (f n (t,dw,0) - g n (dw,0)) 



Vn JO L J0 

No no 



< 



1 



EE 



i=l j=l 



/ (f n (t, dw, 0) - g n (dw, 0)) 



m-i 



+ 2e • K T 



< 3eK T . 

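Therefore, combining the above inequalities, we have that for all large $n$,

$$P\left( \sup_{\vartheta,\,t \in [0,\tau]} \left| \frac{1}{\sqrt{V_n}} \int_0^t\!\!\int_0^{s\wedge\vartheta} \big(f_n(t,s,u) - g_n(s,u)\big)\, dM_{n,s} \right| < 6\epsilon K_\tau + \epsilon \right) > 1 - 5\epsilon,$$

which completes our proof. □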



6.3.1. Proof of Proposition 6.1. We shall make use of some of the basic martingale inequalities given in the following lemma, which is due to Lenglart, Lepingle and Pratelli (1980).

Lemma 6.3. Let $\{W(s), \mathcal G(s), s \ge 0\}$ be a martingale with right continuous paths and left limits. For any $q > 1$, there exists a constant $C_q$ depending only on $q$, such that

$$E\Big(\sup_{s \le \tau} |W(s)|^q\Big) \le C_q \Big( E[\langle W\rangle(\tau)]^{q/2} + E\Big(\sup_{s \le \tau} |\Delta W(s)|^q\Big) \Big). \tag{6.4}$$

Moreover, if $\sup_{s\le\tau} |\Delta W(s)| \le c$, then for any $a, b > 0$,

$$P\Big(\sup_{s \le \tau} |W(s)| > a,\ \langle W\rangle(\tau) \le b\Big) \le 2\exp\Big(-\frac{a^2}{2b}\,\psi(ac/b)\Big),$$

where $\psi(x) = 2x^{-2}\{(1+x)[\log(1+x) - 1] + 1\}$.
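
The function $\psi$ in Lemma 6.3 is easy to evaluate numerically, which helps in seeing how the exponential bound behaves. The sketch below assumes the exponent has the standard Bernstein-type form $-(a^2/(2b))\,\psi(ac/b)$; that form, and the helper names `psi` and `tail_bound`, are assumptions made for illustration.

```python
import math

def psi(x):
    """psi(x) = 2 x^{-2} {(1 + x)[log(1 + x) - 1] + 1} for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    return 2.0 * ((1.0 + x) * (math.log1p(x) - 1.0) + 1.0) / (x * x)

def tail_bound(a, b, c):
    """Assumed right-hand side 2 exp(-(a^2 / (2 b)) psi(a c / b)),
    where c bounds the jumps and b bounds the predictable variation."""
    return 2.0 * math.exp(-(a * a) / (2.0 * b) * psi(a * c / b))
```

For small $x$, $\psi(x) \approx 1 - x/3$, so when the jump bound $c$ is small relative to $b/a$ the bound is close to the Gaussian-type tail $2\exp(-a^2/(2b))$.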



Proof of Proposition 6.1. Choose positive numbers $p, q > 1$ such that $pq/2 - p - q > 1$. Let $u_{n,0} = 0$ and define $u_{n,j}$ inductively by

$$u_{n,j+1} = \inf\{\vartheta : \vartheta > u_{n,j},\ 2\tau K_\tau (R_{n,\vartheta} - R_{n,u_{n,j}}) \ge \epsilon^p V_n\} \wedge (u_{n,j} + \epsilon^p) \wedge \tau.$$

Condition B(ii) implies that there are at most $O(\epsilon^{-p})$ many, say $n_0$, distinct points in $[0,\tau]$ for all large $n$. From Lemma 2.2, $\{W_{n,t,\vartheta},\ \mathcal F_{n,t},\ t \ge 0\}$ is a martingale, and we know that $u_{n,j}$, $j = 1, \ldots, n_0$, are $\{\mathcal F_{n,t},\ 0 \le t \le \tau\}$ predictable. Thus, $\{\sup_{\vartheta \in [u_{n,j},\,u_{n,j+1}]} |W_{n,t,\vartheta} - W_{n,t,u_{n,j}}|,\ \mathcal F_{n,t},\ t \ge 0\}$ is a nonnegative submartingale. By the Markov inequality and Doob's (1953) maximal inequality,



P max sup \W ni t,# - W n>t , Un A > e 

\0<j<n mu nij ,u nJ+1 ]; 

0<t<T 

1 710-1 / \ 

- E [ su p \w n ,t,6 - w n ,t,u nJ r 



0<t<T 



no— 1 



^ (1.0 — -l / \ 

^ ^ E (^ij 



Since { W„ jTi ^, J n ,T^, "& > 0} is a martingale and sup^ e[Wn ^ Un A|W n>r ^- 
W n , T , Un J < ±±f^, then following (6.4), 



-■ no — 1 / \ q 

^E (^rj El 



j = Q W / V d£[Un,j, U n ,j+l] 

no-1 

< 



1 _° / \ 1 I 1 _|_ X{ \ 

~ q E ( Try ) <M ^[(^, Un , + .)K,- + i - u nij )]^ 2 + 
j=o / \ v n / 

< C* q (e) pq/2 ~ p - q < e, 

where $C^*_q$ is a constant depending only on $q$ and the last inequality holds when $\epsilon$ is small enough. Then the desired result follows. □
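
The partition used in this proof is built greedily: a new point is placed as soon as either enough subjects have enrolled or a fixed amount of time has elapsed. The following sketch illustrates that construction under stated assumptions: enrollment jumps by one at each entry time, and the constant multiplying the enrollment increment is passed in as a generic parameter `c` because its exact value is not relied on here. The function name `build_partition` is hypothetical.

```python
import numpy as np

def build_partition(entry_times, tau, eps, p, v_n, c):
    """Greedy partition of [0, tau]: from the current point u, the next
    point is the earliest of
      (a) the first time the enrollment count has grown by at least
          eps**p * v_n / c since u,
      (b) u + eps**p,
      (c) tau.
    """
    entry = np.sort(np.asarray(entry_times, dtype=float))
    step = eps ** p
    need = max(1, int(np.ceil(step * v_n / c)))  # enrollments allowed per cell
    points = [0.0]
    while points[-1] < tau:
        u = points[-1]
        later = entry[entry > u]
        # Time of the need-th enrollment after u, if it exists.
        hit = later[need - 1] if later.size >= need else np.inf
        points.append(min(hit, u + step, tau))
    return np.array(points)
```

Each cell therefore contains a bounded number of enrollments and has bounded length, which is what allows the increments of the normalized martingale over a cell to be controlled.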



6.3.2. Proof of Proposition 6.2. We need the following lemma (see Lemma 5 in Gu and Lai, 1991).

Lemma 6.4. Let $q > 0$ and $r > 1$. Let $\{Z_n, n \ge 1\}$ be a sequence of random variables defined in the same probability space and let $\{g_n\}$ be a sequence of nonnegative integrable functions on a measure space $(X, \mathcal B, \mu)$. Suppose that for every fixed $x \in X$, $g_n(x)$ is nondecreasing in $n \le N$ and that

$$E|Z_i - Z_j|^q \le \left( \int_X [g_i(x) - g_j(x)]\, d\mu(x) \right)^r \quad \text{for all } 1 \le j \le i \le N.$$

Then there exists a universal constant $C_{q,r}$ depending only on $q$ and $r$ such that

$$E\Big( \sup_{1 \le j \le i \le N} |Z_i - Z_j|^q \Big) \le C_{q,r} \left( \int_X [g_N(x) - g_1(x)]\, d\mu(x) \right)^r.$$

Proof of Proposition 6.2. Choose positive numbers p, q > 1 such 
that pq/2 — p — q > 1. Let u>o = 0, and define Wj inductively by Wj+i = 

je p /K T . Denote Nq = K T r/e p + 1, and redefine wn = t. 

Let w n> i = iV~ r and Af w = {w n ,i ■ i = 0,1, ■ ■ ■ ,[t ■ V£\ + 1}- For state- 
ment simplicity, assume that V n takes a constant value. Then 



/ / Pn (duds) > 2 = 0(VZ)0(V- 2r ) = 0(V~ 2r+2 ). 

JO J U+W„ i J 



rr ru+w n> i + i 
'0 Ju+w n:i 

From Condition B(iii), it follows that 



(6-5) P sup \W n , w , T - W n , Wn!i , T \ > 2V~ 1 / 2 + K T V~ r+l 

\i,w„ t i<w<Wn,i+i . 



(pt pu-fW n ,i+l \ 
sup / / p n {duds) > 2 

i JO Ju+w n}i J 

(PT PU+W n) i+l N 
sup/ / q n {duds) > K T V~ r+1+1 ' 2 

i JO Ju+w„i 



i JO Ju+w U}l 

< 0{V-' r+2 ) + P{R T >V^ l l 2 ), 

which converges to 0 in probability, according to Condition B(ii), when $r > 2$.

Therefore, to prove Proposition 6.2, by (6.5) and the martingale property for $\{W_{n,w,\vartheta},\ \mathcal F_{n,\tau,\vartheta},\ 0 \le \vartheta \le \tau\}$ along entry time, we only need to show that for any $\epsilon > 0$,

PI max sup \W n w $ — W n w . #1 > e I — > 0, as n — > co. 

\0<j<N o<t><T ' J ' J 

For simplicity, we only need to consider the case in which equation (3.1) holds almost surely. Then, by Doob's inequality and (6.4), as in the proof of Proposition 6.1,



< 



< 



1 y e 

3=0 

-T (— 

€ q \q-\ 



P I max sup 

\0<j<N o o<tf<r 

No-1 

sup 

'J— U ™e[iDj .tuj^j^JnAfu, 



i=0 



SUp ^ |W„ )U , )T - W n , WjjT \ 

we [wj ,wj+i]nAf w 



Since $W_{n,w_{n,i+1},\vartheta} - W_{n,w_{n,i},\vartheta}$ is an $\{\mathcal F_{n,\tau,\vartheta},\ \vartheta \ge 0\}$ martingale, from (6.4) and (3.1), which now holds almost surely, we have



mw n , Wn , i+1 ,T-w n , Wni , T \<i) 



< C q I E[(W n , Wn . +1 , - W n>Wni ,)(r)}^ + 



yq/2 



c 



o 



K T • l(x < W^i+l) - K T • l(x < W n j) 



dx 



q/2 



where $C$ is some large constant. Then from Lemma 6.4, there exists $C^* > 0$ such that for all large $n$,



1 N °~ 1 / \ q 
( 6 - 6 ) ^Ef^Tl^f SU P |Wn, W ,r-Wn, 

6 ~^ WK^+iJnAC 



3=0 
No-1 



< 



\ 9/2 



3=0 

< C*(2e) pq/2 ~ p ~ 9 



where = max{i : w n ^ < Wj}. Thus (6.6) < e when e is small enough. 



Then the desired conclusion follows. 



□ 



References. 

[1] Andersen, P. K., Borgan, Ø., Gill, R. D. and Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer.

[2] Arjas, E. and Haara, P. (1984). A marked point process approach to censored failure 
data with complicated covariates. Scandinavian Journal of Statistics 11 193-209. 

[3] Arjas, E. (1989). Survival models and martingale dynamics (with discussion). Scan- 
dinavian Journal of Statistics 16 177-225. 






[4] Bilias, Y., Gu, M. and Ying, Z. (1997). A general asymptotic theory for Cox model with staggered entry. The Annals of Statistics 25 662-682.

[5] Cox, D. R. (1972). Regression Models and Life-Tables. Journal of the Royal Statistical Society. Series B (Methodological) 34 187-220.

[6] Cui, L., Hung, H. M. J. and Wang, S. J. (1999). Modification of sample size in group sequential trials. Biometrics 55 853-857.
[7] Doob, J. L. (1953). Stochastic Processes. Wiley, New York. 

[8] Feng, J. (1999). Estimations in Survival Analysis - A Stochastic Filtering Perspective. 
Applied Probability Trust October 28. 

[9] Fisher, L. (1998). Self-designing clinical trials. Statistics in Medicine 17 1551-1562.

[10] Fleming, T. R. and Harrington, D. (1991). Counting Processes and Survival Analysis. John Wiley & Sons, Inc.

[11] Friedman, L. M., Furberg, C. D. and DeMets, D. L. (1998). Fundamentals of Clinical Trials (3rd edition). Springer-Verlag, New York.

[12] Gu, M. G. and Lai, T. L. (1991). Weak convergence of time-sequential censored 
rank statistics with applications to sequential testing in clinical trials. The Annals of 
Statistics 19 1403-1433. 

[13] Hu, F. and Rosenberger, W. F. (2006). The theory of response-adaptive random- 
ization in clinical trials. John Wiley & Sons, Inc. 

[14] Hu, F., Zhang, L. and He, X. (2009). Efficient randomized-adaptive designs. The 
Annals of Statistics 37 2543-2560. 

[15] Jacod, J. and Shiryaev, A. N. (2003). Limit Theorems for Stochastic Processes 
(2nd edition). Springer. 

[16] Jennison, C. and Turnbull, B. W. (2000). Group Sequential Methods with Appli- 
cations to Clinical Trials. Chapman & Hall/CRC. 

[17] Kallianpur, G. and Xiong, J. (1995). Stochastic Differential Equations in Infinite Dimensional Spaces. IMS Lecture Notes.

[18] Kalbfleisch, J. D. and Prentice, R. L. (2002). The Statistical Analysis of Failure Time Data. New York: Wiley.

[19] Lai, T. L. and Siegmund, D. (1983). Fixed Accuracy Estimation of an Autoregressive Parameter. The Annals of Statistics 11 478-485.

[20] Lan, K. K. G. and DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika 70 659-663.

[21] Last, G. and Brandt, A. (1995). Marked Point Processes on the Real Line: The 
Dynamic Approach. Springer. 

[22] Lenglart, E., Lepingle, D. and Pratelli, M. (1980). Présentation unifiée de certaines inégalités de la théorie des martingales. Séminaire de Probabilités. Lecture Notes in Math 14 26-48. Springer, Berlin.

[23] Mantel, N. and Haenszel, W. (1959). Statistical aspects of the analysis of data 
from retrospective studies of disease. Journal of the National Cancer Institute 22 
719-48. 

[24] Martinussen, T. and Scheike, T. H. (2000). A nonparametric dynamic additive 
regression model for longitudinal data. The Annals of Statistics 28 1000-1025. 

[25] Martinussen, T. and Scheike, T. H. (2006). Dynamic Regression Models for Sur- 
vival Data. Springer. 

[26] Murphy, S. A. and Bingham, D. (2008). Screening Experiments for Developing 
Dynamic Treatment Regimes. J. Amer. Statist. Assoc. 184 391-408. 

[27] O'Brien, P. C. and Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics 35 549-556.

[28] Pocock, S. J. (1977). Group sequential methods in the design and analysis of clinical trials. Biometrika 64 191-199.

[29] Pocock, S. J. and Simon, R. (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics 31 103-115.

[30] Proschan, M. A., Lan, K. K. G. and Wittes, J. T. (2006). Statistical Monitoring of Clinical Trials: A Unified Approach. Springer.

[31] Rebolledo, R. (1980). Central limit theorem for local martingales. Z. Wahr. verw. 
Geb. 51 269-286. 

[32] Robins, J. M. (1986). A new approach to causal inference in mortality studies with sustained exposure periods-application to control of the healthy worker survivor effect. Computers and Mathematics with Applications 14 1393-1512.

[33] Rosenberger, W. F. and Sverdlov, O. (2008). Handling covariates in the design of clinical trials. Statistical Science 23 404-419.

[34] Sellke, T. and Siegmund, D. (1983). Sequential analysis of the proportional hazards model. Biometrika 70 315-26.

[35] Shen, Y. and Fisher, L. (1999). Statistical inference for self-designing clinical trials with a one-sided hypothesis. Biometrics 55 190-197.

[36] Shen, Y. and Cai, J. (2003). Sample size reestimation for clinical trials with censored survival data. J. Amer. Statist. Assoc. 98 418-426.

[37] Slud, E. (1984). Sequential linear rank tests for two-sample censored survival data. The Annals of Statistics 12 551-571.

[38] Wei, L. J. (1978). The adaptive biased coin design for sequential experiments. The Annals of Statistics 6 92-100.

[39] Wei, L. J. and Durham, S. (1978). The randomized play-the-winner rule in medical trials. J. Amer. Statist. Assoc. 73 840-843.

[40] Zelen, M. (1969). Play the winner and the controlled clinical trial. J. Amer. Statist. Assoc. 64 131-146.



Xiaolong Luo
Biostatistics, Clinical R&D
Celgene Corporation
86 Morris Avenue
Summit, NJ 07901

Gongjun Xu and Zhiliang Ying
Department of Statistics
Columbia University
1255 Amsterdam Avenue
New York, NY 10027
E-mail: gongjun@stat.columbia.edu



